Attorney Docket No. 034536/1034 



NUCLEIC ACIDS ENCODING PAK5 POLYPEPTIDES 
AND METHODS OF USE THEREOF 

RELATED APPLICATIONS 

[0001] This application is a divisional of U.S. Application Serial No. 09/688,188 
filed October 16, 2000, which is a divisional of U.S. Application Serial No. 09/291,417, 
filed April 13, 1999, which in turn claims priority to U.S. Provisional Application Serial
No. 60/081,784, filed April 14, 1998. This application claims only subject matter 
disclosed in the parent application and therefore presents no new matter. 

[0002] The instant application contains a "lengthy" Sequence Listing which has 
been submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby 
incorporated by reference in its entirety. Said CD-R are labeled "CRF", "Copy 1" and 
"Copy 2", respectively, and each contains only one identical 329 Kb file (3 8602329. APP). 

FIELD OF THE INVENTION 
[0003] The present invention relates to novel kinase polypeptides, nucleotide 
sequences encoding the novel kinase polypeptides, as well as various products and 
methods useful for the diagnosis and treatment of various kinase-related diseases and 
conditions. 

BACKGROUND OF THE INVENTION 
[0004] The following description of the background of the invention is provided to 
aid in understanding the invention, but is not admitted to be or to describe prior art to the 
invention. 

[0005] Cellular signal transduction is a fundamental mechanism whereby external 
stimuli that regulate diverse cellular processes are relayed to the interior of cells. One of 
the key biochemical mechanisms of signal transduction involves the reversible 
phosphorylation of proteins, which enables regulation of the activity of mature proteins by 
altering their structure and function. 

[0006] The best characterized protein kinases in eukaryotes phosphorylate proteins 
on the hydroxyl moiety of serine, threonine and tyrosine residues. These kinases largely 
fall into two groups, those specific for phosphorylating serines and threonines, and those 
specific for phosphorylating tyrosines. Some kinases, referred to as "dual specificity" 
kinases, are able to phosphorylate on tyrosine as well as serine/threonine residues.

[0007] Protein kinases can also be characterized by their location within the cell. 
Some kinases are transmembrane receptor-type proteins capable of directly altering their 
catalytic activity in response to the external environment such as the binding of a ligand. 
Others are non-receptor-type proteins lacking any transmembrane domain. They can be 
found in a variety of cellular compartments from the inner surface of the cell membrane to 
the nucleus. 

[0008] Many kinases are involved in regulatory cascades wherein their substrates 
may include other kinases whose activities are regulated by their phosphorylation state. 
Ultimately the activity of some downstream effector is modulated by phosphorylation 
resulting from activation of such a pathway. 

[0009] Protein kinases are one of the largest families of eukaryotic proteins with 
several hundred known members. These proteins share a 250-300 amino acid domain that 
can be subdivided into 12 distinct subdomains that comprise the common catalytic core 
structure. These conserved protein motifs have recently been exploited using PCR-based 
cloning strategies leading to a significant expansion of the known kinases. 

[0010] Multiple alignment of the sequences in the catalytic domain of protein 
kinases and subsequent parsimony analysis permits the segregation of related kinases into 
distinct branches or subfamilies including: tyrosine kinases, cyclic-nucleotide-dependent 
kinases, calcium/calmodulin kinases, cyclin-dependent kinases and MAP-kinases, serine- 
threonine kinase receptors, and several other less defined subfamilies. 

SUMMARY OF THE INVENTION 
[0011] Through the use of a targeted PCR cloning strategy and of a "motif
extraction" bioinformatics script, mammalian members of the STE20-kinase family have 
been identified as part of the present invention. Multiple alignment and parsimony 
analysis of the catalytic domain of all of these STE20-family members reveals that these 
proteins cluster into 9 distinct subgroups. Classification in this manner has proven highly 
accurate not only in predicting motifs present in the remaining non-catalytic portion of 
each protein, but also in their regulation, substrates, and signaling pathways. The present 
invention includes the partial or complete sequence of new members of the STE20-family, 
their classification, predicted or deduced protein structure, and a strategy for elucidating
their biologic and therapeutic relevance. 

[0012] Thus, a first aspect of the invention features an isolated, enriched, or 
purified nucleic acid molecule encoding a kinase polypeptide selected from the group 
consisting of STLK2, STLK3, STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, 
KHS2, SULU1, SULU3, GEO, PAK4, and PAK5. 

[0013] By "isolated" in reference to nucleic acid is meant a polymer of nucleotides 
conjugated to each other, including DNA and RNA, that is isolated from a natural source 
or that is synthesized. The isolated nucleic acid of the present invention is unique in the 
sense that it is not found in a pure or separated state in nature. Use of the term "isolated" 
indicates that a naturally occurring sequence has been removed from its normal cellular 
(i.e., chromosomal) environment. Thus, the sequence may be in a cell-free solution or
placed in a different cellular environment. The term does not imply that the sequence is 
the only nucleotide chain present, but that it is essentially free (about 90 - 95% pure at 
least) of non-nucleotide material naturally associated with it, and thus is distinguished 
from isolated chromosomes. 

[0014] By the use of the term "enriched" in reference to nucleic acid is meant that 
the specific DNA or RNA sequence constitutes a significantly higher fraction (2-5 fold) 
of the total DNA or RNA present in the cells or solution of interest than in normal or 
diseased cells or in the cells from which the sequence was taken. This could be caused by 
a person by preferential reduction in the amount of other DNA or RNA present, or by a 
preferential increase in the amount of the specific DNA or RNA sequence, or by a 
combination of the two. However, it should be noted that enriched does not imply that 
there are no other DNA or RNA sequences present, just that the relative amount of the 
sequence of interest has been significantly increased. The term "significant" is used to 
indicate that the level of increase is useful to the person making such an increase, and 
generally means an increase relative to other nucleic acids of about at least 2 fold, more 
preferably at least 5 to 10 fold or even more. The term also does not imply that there is no 
DNA or RNA from other sources. The other source DNA may, for example, comprise 
DNA from a yeast or bacterial genome, or a cloning vector such as pUC19. This term 
distinguishes from naturally occurring events, such as viral infection, or tumor type 
growths, in which the level of one mRNA may be naturally increased relative to other
species of mRNA. That is, the term is meant to cover only those situations in which a 
person has intervened to elevate the proportion of the desired nucleic acid. 

[0015] It is also advantageous for some purposes that a nucleotide sequence be in 
purified form. The term "purified" in reference to nucleic acid does not require absolute 
purity (such as a homogeneous preparation). Instead, it represents an indication that the 
sequence is relatively more pure than in the natural environment (compared to the natural 
level this level should be at least 2-5 fold greater, e.g., in terms of mg/mL). Individual 
clones isolated from a cDNA library may be purified to electrophoretic homogeneity. The 
claimed DNA molecules obtained from these clones could be obtained directly from total 
DNA or from total RNA. The cDNA clones are not naturally occurring, but rather are 
preferably obtained via manipulation of a partially purified naturally occurring substance 
(messenger RNA). The construction of a cDNA library from mRNA involves the creation 
of a synthetic substance (cDNA) and pure individual cDNA clones can be isolated from 
the synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the 
process which includes the construction of a cDNA library from mRNA and isolation of 
distinct cDNA clones yields an approximately 10^6-fold purification of the native message.
Thus, purification of at least one order of magnitude, preferably two or three orders, and
more preferably four or five orders of magnitude is expressly contemplated.

[0016] By a "kinase polypeptide" is meant 32 (preferably 40, more preferably 45,
most preferably 55) or more contiguous amino acids set forth in the amino acid sequence
of SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or the corresponding full-length amino
acid sequence; 250 (preferably 255, more preferably 260, most preferably 270) or more
contiguous amino acids set forth in the amino acid sequence SEQ ID NO:13, SEQ ID
NO:14, SEQ ID NO:15, or SEQ ID NO:105, or the corresponding full-length amino acid
sequence; 27 (preferably 30, more preferably 40, most preferably 45) or more contiguous
amino acids set forth in the amino acid sequence SEQ ID NO:18; 16 (preferably 20, more
preferably 25, most preferably 35) or more contiguous amino acids set forth in the amino
acid sequence SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID
NO:31, or SEQ ID NO:103, or the corresponding full-length amino acid sequence; 6
(preferably 10, more preferably 15, most preferably 25) or more contiguous amino acids
set forth in the amino acid sequence of SEQ ID NO:97 or SEQ ID NO:99; 22 (preferably
30, more preferably 35, most preferably 45) or more contiguous amino acids set forth in
the amino acid sequence of SEQ ID NO:101, or the corresponding full-length amino acid
sequence; 78 (preferably 80, more preferably 85, most preferably 90) or more contiguous
amino acids set forth in the amino acid sequence SEQ ID NO:107 or functional derivatives
thereof as described herein. For sequences for which the full-length sequence is not given,
the remaining sequences can be determined using methods well-known to those in the art
and are intended to be included in the invention. In certain aspects, polypeptides of 100,
200, 300 or more amino acids are preferred. The kinase polypeptide can be encoded by a
full-length nucleic acid sequence or any portion of the full-length nucleic acid sequence, so
long as a functional activity of the polypeptide is retained, not to include fragments
containing only amino acids 1-22 of SEQ ID NO:13 or only amino acids 1-33 of SEQ ID
NO:107.

[0017] The amino acid sequence will be substantially similar to the sequence 
shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, 
SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID 
NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:100, SEQ ID 
NO:103, SEQ ID NO:105, or SEQ ID NO:107, or the corresponding full-length amino
acid sequence, or fragments thereof, not to include fragments consisting only of the amino 
acid sequences 1-22 of SEQ ID NO:13 or 1-33 of SEQ ID NO:107. A sequence that is 
substantially similar to the sequence of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ 
ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID 
NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, 
SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107 will preferably 
have at least 90% identity (more preferably at least 95% and most preferably 99-100%) to 
the sequence of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID 
NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ
ID NO:103, SEQ ID NO:105, or SEQ ID NO:107. 

[0018] By "identity" is meant a property of sequences that measures their similarity 
or relationship. Identity is measured by dividing the number of identical residues by the 
total number of residues and gaps and multiplying the product by 100. "Gaps" are spaces
in an alignment that are the result of additions or deletions of amino acids. Thus, two
copies of exactly the same sequence have 100% identity, but sequences that are less highly 
conserved, and have deletions, additions, or replacements, may have a lower degree of 
identity. Those skilled in the art will recognize that several computer programs are 
available for determining sequence identity using standard parameters, for example Blast 
(Altschul, et al. (1997) Nucleic Acids Res. 25:3389-3402), Blast2 (Altschul, et al. (1990) 
J. Mol. Biol. 215:403-410), and Smith-Waterman (Smith, et al. (1981) J. Mol. Biol.
147:195-197). 
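
By way of illustration only, the identity calculation defined above can be sketched as follows. The sketch assumes the two sequences have already been aligned (for example by one of the programs named above) and are supplied as equal-length strings in which "-" marks a gap; the function name `percent_identity` and the choice to ignore columns in which both rows are gaps are assumptions made for this example and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Identity = identical residues / (residues + gaps) * 100, where the
    denominator is taken here as the number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Assumption: columns where BOTH rows hold a gap carry no information
    # (they can occur in rows cut from a multiple alignment) and are skipped.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")

    # A column counts as identical only when both rows hold the same residue.
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Two copies of exactly the same sequence give 100% identity.
    print(percent_identity("MKTAYIAK", "MKTAYIAK"))
    # One gap and one replacement over eight columns lower the value.
    print(percent_identity("MKT-YIAK", "MKTAYIAR"))
```

Different alignment programs may define the denominator differently (for example, the length of the shorter sequence), so values from this sketch are not necessarily comparable to those reported by a particular tool.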

[0019] In preferred embodiments, the invention features isolated, enriched, or 
purified nucleic acid molecules encoding a kinase polypeptide comprising a nucleotide 
sequence that: (a) encodes a polypeptide having the amino acid sequence set forth in SEQ 
ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, 
SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID 
NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID
NO:105, or SEQ ID NO:107; (b) is the complement of the nucleotide sequence of (a); (c)
hybridizes under highly stringent conditions to the nucleotide molecule of (a) and encodes
a naturally occurring kinase polypeptide; (d) encodes a kinase polypeptide having the
amino acid sequence of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID
NO:103, SEQ ID NO:105, or SEQ ID NO:107, except that it lacks one or more, but not all, of
the following segments of amino acid residues: 1-21, 22-274, or 275-416 of SEQ ID NO:5,
1-31, 32-308, 309-489 or 490-516 of SEQ ID NO:6, 1-178 or 179-414 of SEQ ID NO:7, 1-
22, 23-289, 290-526, 527-640, 641-896, or 897-1239 of SEQ ID NO:13, 1-255, 256-442,
443-626, 627-954, or 955-1297 of SEQ ID NO:14, 1-255, 256-476, 477-680, 681-983, or 
984-1326 of SEQ ID NO:15, 1-13, 14-273, 274-346, 347-534, or 535-894 of SEQ ID 
NO:18, 1-21, 22-277, 278-427, 428-637, 638-751, or 752-898 of SEQ ID NO:22, 1-66, 67- 
215, 216-425, 426-539, 540-786, or 787-887 of SEQ ID NO:23, 1-25, 26-273, 274-422, 
423-632, or 633-748 of SEQ ID NO:24, 1-51, 52-224, 225-393, 394-658, or 659-681 of 
SEQ ID NO:29, 1-25, 26-281, 282-430, 431-640, 641-754, 755-901, or 902-1001 of SEQ
ID NO:31, 1-10, 11-321, or 322-373 of SEQ ID NO:97, 1-57, 58-369, or 370-418 of SEQ
ID NO:99, 1-52, 53-173, 174-307, 308-572, or 573-591 of SEQ ID NO:103, 1-24, 25-289,
290-397, 398-628, 629-872, or 873-1227 of SEQ ID NO:105, or 1-33, 34-294, 295-337,
338-472, 473-724, or 725-968 of SEQ ID NO:107; (e) is the complement of the nucleotide
sequence of (d); (f) encodes a polypeptide having the amino acid sequence set forth in
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:103, SEQ ID NO:105, or
SEQ ID NO:107 from amino acid residues 1-21, 22-274, or 275-416 of SEQ ID NO:5, 1-
31, 32-308, 309-489, or 490-516 of SEQ ID NO:6, 1-178 or 179-414 of SEQ ID NO:7, 23-
289, 290-526, 527-640, 641-896, or 897-1239 of SEQ ID NO:13, 1-255, 256-442, 443-
626, 627-954, or 955-1297 of SEQ ID NO:14, 1-255, 256-476, 477-680, 681-983, or 984-
1326 of SEQ ID NO:15, 1-13, 14-273, 274-346, 347-534, or 535-894 of SEQ ID NO:18,
1-21, 22-277, 278-427, 428-637, 638-751, or 752-898 of SEQ ID NO:22, 1-66, 67-215,
216-425, 426-539, 540-786, or 787-887 of SEQ ID NO:23, 1-25, 26-273, 274-422, 423-
632, or 633-748 of SEQ ID NO:24, 1-51, 52-224, 225-393, 394-658, or 659-681 of SEQ
ID NO:29, 1-25, 26-281, 282-430, 431-640, 641-754, 755-901, or 902-1001 of SEQ ID
NO:31, 1-10, 11-321, or 322-373 of SEQ ID NO:97, 1-57, 58-369, or 370-418 of SEQ ID
NO:99, 1-52, 53-173, 174-307, 308-572, or 573-591 of SEQ ID NO:103, 1-24, 25-289,
290-397, 398-628, 629-872, or 873-1227 of SEQ ID NO:105, or 1-33, 34-294, 295-337,
338-472, 473-724, or 725-968 of SEQ ID NO:107; (g) is the complement of the nucleotide
sequence of (f); (h) encodes a polypeptide having the amino acid sequence set forth in SEQ
ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID
NO:105, or SEQ ID NO:107, except that it lacks one or more of the domains selected from
the group consisting of an N-terminal domain, a catalytic domain, a C-terminal domain, a
coiled-coil structure region, a proline-rich region, a spacer region, an insert, and a C-
terminal tail; or (i) is the complement of the nucleotide sequence of (h).

[0020] The term "complement" refers to two nucleotides that can form multiple 
favorable interactions with one another. For example, adenine is complementary to 
thymine as they can form two hydrogen bonds. Similarly, guanine and cytosine are
complementary since they can form three hydrogen bonds. A nucleotide sequence is the 
complement of another nucleotide sequence if all of the nucleotides of the first sequence 
are complementary to all of the nucleotides of the second sequence. 
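
By way of illustration only, the base-pairing rule just described can be sketched as follows for DNA. The function names and the `antiparallel` option are assumptions made for this example; the option reflects the fact that paired strands run in opposite directions, so that when both sequences are written 5' to 3' the complement of a sequence is read in reverse.

```python
# Watson-Crick pairs for DNA: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, in the same left-to-right order as the input."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (both strands written 5'->3')."""
    return complement(seq)[::-1]


def is_complement(first: str, second: str, antiparallel: bool = True) -> bool:
    """True if every nucleotide of `first` pairs with the matching one of `second`.

    With antiparallel=True (an assumption of this sketch) both sequences are
    taken to be written 5'->3'; with antiparallel=False they are compared
    position by position.
    """
    expected = reverse_complement(first) if antiparallel else complement(first)
    return second.upper() == expected


if __name__ == "__main__":
    print(complement("ATGC"))
    print(reverse_complement("ATGC"))
    print(is_complement("ATGC", "GCAT"))
```

The sketch handles only the four DNA bases; RNA (uracil in place of thymine) and ambiguity codes would require an extended pairing table.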

[0021] The term "domain" refers to a region of a polypeptide which contains a 
particular function. For instance, N-terminal or C-terminal domains of signal transduction 
proteins can serve functions including, but not limited to, binding molecules that localize 
the signal transduction molecule to different regions of the cell or binding other signaling 
molecules directly responsible for propagating a particular cellular signal. Some domains 
can be expressed separately from the rest of the protein and function by themselves, while 
others must remain part of the intact protein to retain function. The latter are termed 
functional regions of proteins and also relate to domains. 

[0022] The term "N-terminal domain" refers to the extracatalytic region located 
between the initiator methionine and the catalytic domain of the protein kinase. The N-
terminal domain can be identified following a Smith-Waterman alignment of the protein
sequence against the non-redundant protein database to define the N-terminal boundary of 
the catalytic domain. Depending on its length, the N-terminal domain may or may not play 
a regulatory role in kinase function. An example of a protein kinase whose N-terminal 
domain has been shown to play a regulatory role is PAK65, which contains a CRIB motif 
used for Cdc42 and Rac binding (Burbelo, P.D. et al. (1995) J. Biol. Chem.
270:29071-29074).

[0023] The N-terminal domain spans amino acid residues 1-21 of the sequence set
forth in SEQ ID NO:5, amino acid residues 1-31 of the sequence set forth in SEQ ID
NO:6, amino acid residues 1-22 of the sequence set forth in SEQ ID NO:13, amino acid
residues 1-13 of the sequence set forth in SEQ ID NO:18, amino acid residues 1-21 of the
sequence set forth in SEQ ID NO:22, amino acid residues 1-25 of the sequence set forth in
SEQ ID NO:24, amino acid residues 1-51 of the sequence set forth in SEQ ID NO:29,
amino acid residues 1-25 of the sequence set forth in SEQ ID NO:31, amino acid residues
1-57 of the sequence set forth in SEQ ID NO:99, amino acid residues 1-52 of the sequence
set forth in SEQ ID NO:103, amino acid residues 1-24 of the sequence set forth in SEQ ID
NO:105, or amino acid residues 1-33 of the sequence set forth in SEQ ID NO:107.

[0024] The term "catalytic domain" refers to a region of the protein kinase that is
typically 25-300 amino acids long and is responsible for carrying out the phosphate 
transfer reaction from a high-energy phosphate donor molecule such as ATP or GTP to 
itself (autophosphorylation) or to other proteins (exogenous phosphorylation). The 
catalytic domain of protein kinases is made up of 12 subdomains that contain highly 
conserved amino acid residues, and are responsible for proper polypeptide folding and for 
catalysis. The catalytic domain can be identified following a Smith-Waterman alignment
of the protein sequence against the non-redundant protein database. 

[0025] The catalytic domain spans amino acid residues 22-274 of the sequence set 
forth in SEQ ID NO:5, residues 32-308 of the sequence set forth in SEQ ID NO:6, residues 
1-178 of the sequence set forth in SEQ ID NO:7, residues 23-289 of the sequence set forth 
in SEQ ID NO:13, residues 1-255 of the sequence set forth in SEQ ID NO:14, residues 1-
255 of the sequence set forth in SEQ ID NO:15, residues 14-273 of the sequence set forth
in SEQ ID NO:18, residues 22-277 of the sequence set forth in SEQ ID NO:22, residues 1-
66 of the sequence set forth in SEQ ID NO:23, residues 26-273 of the sequence set forth in
SEQ ID NO:24, residues 394-658 of the sequence set forth in SEQ ID NO:29, residues 26-
281 of the sequence set forth in SEQ ID NO:31, residues 1-278 of the sequence set forth in
SEQ ID NO:97, residues 58-369 of the sequence set forth in SEQ ID NO:99, residues 1-
103 of the sequence set forth in SEQ ID NO:101, residues 308-572 of the sequence set
forth in SEQ ID NO:103, residues 25-289 of the sequence set forth in SEQ ID NO:105, or
residues 34-294 of the sequence set forth in SEQ ID NO:107.

[0026] The term "catalytic activity", as used herein, defines the rate at which a 
kinase catalytic domain phosphorylates a substrate. Catalytic activity can be measured, for 
example, by determining the amount of a substrate converted to a phosphorylated product 
as a function of time. Catalytic activity can be measured by methods of the invention by 
holding time constant and determining the concentration of a phosphorylated substrate 
after a fixed period of time. Phosphorylation of a substrate occurs at the active-site of a 
protein kinase. The active-site is normally a cavity in which the substrate binds to the 
protein kinase and is phosphorylated. 
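
By way of illustration only, the fixed-time measurement described above reduces to simple arithmetic: the concentration of phosphorylated substrate found after the fixed period, divided by the length of that period. The function name, the units, and the sample values below are hypothetical and are not taken from the disclosure.

```python
def catalytic_activity(product_conc: float, elapsed_time: float) -> float:
    """Average rate of phosphorylation over a fixed assay period.

    product_conc : concentration of phosphorylated substrate at the end of
                   the period (any consistent unit, e.g. micromolar)
    elapsed_time : length of the period (any consistent unit, e.g. minutes)
    """
    if elapsed_time <= 0:
        raise ValueError("elapsed_time must be positive")
    if product_conc < 0:
        raise ValueError("product_conc cannot be negative")
    return product_conc / elapsed_time


if __name__ == "__main__":
    # Hypothetical readings from two samples assayed for the same 30 minutes.
    sample_a = catalytic_activity(12.0, 30.0)
    sample_b = catalytic_activity(3.0, 30.0)
    # Because time is held constant, the ratio of the rates equals the ratio
    # of the measured product concentrations.
    print(sample_a, sample_b, sample_a / sample_b)
```

Holding the time constant means that samples can be compared directly by their end-point concentrations, which is why the fixed-time format is convenient for side-by-side comparison.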

[0027] The term "substrate" as used herein refers to a molecule phosphorylated by 
a kinase of the invention. Kinases phosphorylate substrates on serine/threonine or tyrosine 
amino acids. The molecule may be another protein or a polypeptide.

[0028] The term "C-terminal domain" refers to the region located between the 
catalytic domain or the last (located closest to the C-terminus) functional domain and the 
carboxy-terminal amino acid residue of the protein kinase. By "functional" domain is 
meant any region of the polypeptide that may play a regulatory or catalytic role as 
predicted from amino acid sequence homology to other proteins or by the presence of 
amino acid sequences that may give rise to specific structural conformations (i.e., coiled-
coils). The C-terminal domain can be identified by using a Smith-Waterman alignment of
the protein sequence against the non-redundant protein database to define the C-terminal 
boundary of the catalytic domain or of any functional C-terminal extracatalytic domain. 
Depending on its length and amino acid composition, the C-terminal domain may or may 
not play a regulatory role in kinase function. An example of a protein kinase whose C-
terminal domain may play a regulatory role is PAK3 which contains a heterotrimeric Gβ
subunit-binding site near its C-terminus (Leeuw, T. et al. (1998) Nature 391:191-195).

[0029] The C-terminal domain spans amino acid residues 275-416 of the sequence 
set forth in SEQ ID NO:5, residues 309-489 of the sequence set forth in SEQ ID NO:6, 
residues 179-414 of the sequence set forth in SEQ ID NO:7, residues 897-1239 of the 
sequence set forth in SEQ ID NO:13, residues 955-1297 of the sequence set forth in SEQ
ID NO:14, residues 984-1326 of the sequence set forth in SEQ ID NO:15, residues 535-
894 of the sequence set forth in SEQ ID NO:18, residues 752-898 of the sequence set forth
in SEQ ID NO:22, residues 279-330 of the sequence set forth in SEQ ID NO:97, residues
370-418 of the sequence set forth in SEQ ID NO:99, or residues 873-1227 of the sequence
set forth in SEQ ID NO:105.

[0030] The term "signal transduction pathway" refers to the molecules that 
propagate an extracellular signal through the cell membrane to become an intracellular 
signal. This signal can then stimulate a cellular response. The polypeptide molecules 
involved in signal transduction processes are typically receptor and non-receptor protein 
tyrosine kinases, receptor and non-receptor protein phosphatases, SRC homology 2 and 3 
domains, phosphotyrosine binding proteins (SRC homology 2 (SH2) and phosphotyrosine 
binding (PTB and PH) domain containing proteins), proline-rich binding proteins (SH3 
domain containing proteins), nucleotide exchange factors, and transcription factors. 

[0031] The term "coiled-coil structure region" as used herein, refers to a
polypeptide sequence that has a high probability of adopting a coiled-coil structure as 
predicted by computer algorithms such as COILS (Lupas, A. (1996) Meth. Enzymology 
266:513-525). Coiled-coils are formed by two or three amphipathic α-helices in parallel.
Coiled-coils can bind to coiled-coil domains of other polypeptides resulting in homo- or
heterodimers (Lupas, A. (1991) Science 252:1162-1164). Coiled-coil-dependent
oligomerization has been shown to be necessary for protein function including catalytic 
activity of serine/threonine kinases (Roe, J. et al. (1997) J. Biol. Chem. 272:5838-5845). 

[0032] The coiled-coil structure region spans amino acid residues 290-526 of the 
sequence set forth in SEQ ID NO:13, residues 256-442 of the sequence set forth in SEQ ID
NO:14, residues 256-476 of the sequence set forth in SEQ ID NO:15, residues 428-637 of 
the sequence set forth in SEQ ID NO:22, residues 216-425 or 540-786 of the sequence set 
forth in SEQ ID NO:23, residues 423-632 of the sequence set forth in SEQ ID NO:24, 
residues 431-640 or 755-901 of the sequence set forth in SEQ ID NO:31, residues 291-398 
or 629-668 of the sequence set forth in SEQ ID NO:105, or residues 473-724 or 725-968
of the sequence set forth in SEQ ID NO:107. 

[0033] The term "proline-rich region" as used herein, refers to a region of a protein
kinase whose proline content over a given amino acid length is higher than the average
content of this amino acid found in proteins (i.e., >10%). Proline-rich regions are easily
discernible by visual inspection of amino acid sequences and quantitated by standard
computer sequence analysis programs such as the DNAStar program EditSeq. Proline-rich
regions have been demonstrated to participate in regulatory protein-protein interactions.
Among these interactions, those that are most relevant to this invention involve the "PxxP"
(SEQ ID NO:148) proline-rich motif found in certain protein kinases (i.e., human PAK1)
and the SH3 domain of the adaptor molecule Nck (Galisteo, M.L. et al. (1996) J. Biol.
Chem. 271:20997-21000). Other regulatory interactions involving "PxxP" (SEQ ID
NO:148) proline-rich motifs include the WW domain (Sudol, M. (1996) Prog. Biophys. Mol.
Biol. 65:113-132).
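
By way of illustration only, the proline-content criterion above (>10% over a given amino acid length) can be sketched as a sliding-window scan. The 20-residue default window, the function names, and the reporting of 1-based inclusive residue ranges are assumptions made for this example; the sketch is not the EditSeq program and does not reproduce its output.

```python
def proline_fraction(segment: str) -> float:
    """Fraction of residues in `segment` that are proline (one-letter code P)."""
    if not segment:
        raise ValueError("segment must not be empty")
    return segment.upper().count("P") / len(segment)


def proline_rich_windows(seq: str, window: int = 20, threshold: float = 0.10):
    """Return 1-based inclusive (start, end) ranges of every window whose
    proline fraction is strictly greater than `threshold`.

    `window` is the "given amino acid length"; its default is an assumption.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        if proline_fraction(seq[i:i + window]) > threshold:
            hits.append((i + 1, i + window))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a proline-dense stretch followed by a
    # proline-free stretch.
    example = "APPLPPRPAG" + "GASTGASTGA"
    print(proline_rich_windows(example, window=10))
```

Overlapping hits could be merged into contiguous regions; the sketch leaves them unmerged so that each reported range corresponds to exactly one window.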

[0034] The proline-rich region spans amino acid residues 527-640 of the sequence 
set forth in SEQ ID NO:13, residues 443-626 of the sequence set forth in SEQ ID NO:14,
residues 477-680 of the sequence set forth in SEQ ID NO:15, residues 347-534 of the
sequence set forth in SEQ ID NO:18, residues 398-628 of the sequence set forth in SEQ ID
NO:105, or residues 338-472 of the sequence set forth in SEQ ID NO:107.

[0035] The term "spacer region" as used herein, refers to a region of the protein 
kinase located between predicted functional domains. The spacer region has no detectable 
homology to any amino acid sequence in the database, and can be identified by using a 
Smith-Waterman alignment of the protein sequence against the non-redundant protein
database to define the C- and N-terminal boundaries of the flanking functional domains.
Spacer regions may or may not play a fundamental role in protein kinase function.
Precedent for the regulatory role of spacer regions in kinase function is provided by the
role of the src kinase spacer in inter-domain interactions (Xu, W. et al. (1997) Nature 
385:595-602). 

[0036] The spacer region spans amino acid residues 641-896 of the sequence set 
forth in SEQ ID NO:13, residues 627-954 of the sequence set forth in SEQ ID NO:14, 
residues 681-983 of the sequence set forth in SEQ ID NO:15, residues 274-346 of the 
sequence set forth in SEQ ID NO:18, residues 278-427 or 638-751 of the sequence set 
forth in SEQ ID NO:22, residues 67-215 or 426-539 of the sequence set forth in SEQ ID 
NO:23, residues 274-422 or 633-748 of the sequence set forth in SEQ ID NO:24, residues 
225-393 of the sequence set forth in SEQ ID NO:29, residues 282-430 or 641-754 of the 
sequence set forth in SEQ ID NO:31, residues 174-307 of the sequence set forth in SEQ ID 
NO:103, residues 669-872 of the sequence set forth in SEQ ID NO:105, or residues 
295-337 of the sequence set forth in SEQ ID NO:107. 

[0037] The term "insert" as used herein refers to a portion of a protein kinase that 
is absent from a close homolog. Inserts may or may not be the product of alternative 
splicing of exons. Inserts can be identified by using a Smith-Waterman sequence alignment 
of the protein sequence against the non-redundant protein database, or by means of a multiple 
sequence alignment of homologous sequences using the DNAStar program Megalign. 
Inserts may play a functional role by presenting a new interface for protein-protein 
interactions, or by interfering with such interactions. Inserts span amino acid residues 
52-224 of the sequence set forth in SEQ ID NO:29 or residues 53-173 of the sequence set 
forth in SEQ ID NO:103. 

[0038] The term "C-terminal tail" as used herein, refers to a C-terminal domain of 
a protein kinase, that by homology extends or protrudes past the C-terminal amino acid of 
its closest homolog. C-terminal tails can be identified by using a Smith-Waterman 
sequence alignment of the protein sequence against the non-redundant protein database, or 
by means of a multiple sequence alignment of homologous sequences using the DNAStar 
program Megalign. Depending on its length, a C-terminal tail may or may not play a 
regulatory role in kinase function. 

[0039] The C-terminal tail spans amino acid residues 490-516 of the sequence set 
forth in SEQ ID NO:6, residues 787-887 of the sequence set forth in SEQ ID NO:23, 
residues 659-681 of the sequence set forth in SEQ ID NO:29, residues 994-1093 of the 
sequence set forth in SEQ ID NO:31, or residues 573-591 of the sequence set forth in SEQ 
ID NO:103. 

[0040] Various low or high stringency hybridization conditions may be used 
depending upon the specificity and selectivity desired. These conditions are well-known 
to those skilled in the art. Under stringent hybridization conditions only highly 
complementary nucleic acid sequences hybridize. Preferably, such conditions prevent 
hybridization of nucleic acids having more than 1 or 2 mismatches out of 20 contiguous 
nucleotides; more preferably, such conditions prevent hybridization of nucleic acids 
having more than 1 or 2 mismatches out of 50 contiguous nucleotides; most preferably, 
such conditions prevent hybridization of nucleic acids having more than 1 or 2 mismatches 
out of 100 contiguous nucleotides. In some instances, the conditions may prevent 
hybridization of nucleic acids having more than 5 mismatches in the full-length sequence. 
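
The mismatch criterion above can be illustrated with a short Python sketch. It is only a sequence-level bookkeeping aid, not a model of hybridization behavior: it assumes two equal-length, ungapped sequences written in the same sense (that is, the probe compared with the complement of the strand it would anneal to), and the function names and the default window of 20 nucleotides with 2 allowed mismatches are illustrative choices.

```python
def max_mismatches_in_window(probe: str, target: str, window: int = 20) -> int:
    """Largest number of mismatched positions found in any run of `window`
    contiguous aligned nucleotides (ungapped, same-sense comparison)."""
    if len(probe) != len(target):
        raise ValueError("sequences must be the same length for this sketch")
    mismatches = [a != b for a, b in zip(probe.upper(), target.upper())]
    if len(mismatches) <= window:
        return sum(mismatches)
    # Sliding-window count: add the position entering, drop the one leaving.
    count = sum(mismatches[:window])
    best = count
    for i in range(window, len(mismatches)):
        count += mismatches[i] - mismatches[i - window]
        best = max(best, count)
    return best


def passes_stringency(probe: str, target: str, window: int = 20, allowed: int = 2) -> bool:
    """True if no window of contiguous nucleotides has more than `allowed` mismatches."""
    return max_mismatches_in_window(probe, target, window) <= allowed


if __name__ == "__main__":
    probe = "ATGGCCAAGCTTGGATCCGA"   # made-up 20-mer
    target = "ATGGCCAAGCTTGGATCCTT"  # differs at the last two positions
    print(max_mismatches_in_window(probe, target), passes_stringency(probe, target))
```

Calling the same function with `window=50` or `window=100` corresponds to the more preferred and most preferred thresholds recited above.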

[0041] By stringent hybridization assay conditions is meant hybridization assay 
conditions at least as stringent as the following: hybridization in 50% formamide, 5X SSC, 
50 mM NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X 
Denhardt's solution at 42 °C overnight; washing with 2X SSC, 0.1% SDS at 45 °C; and 
washing with 0.2X SSC, 0.1% SDS at 45 °C. Under some of the most stringent 
hybridization assay conditions, the second wash can be done with 0.1X SSC at a 
temperature up to 70 °C (Berger et al. (1987) Guide to Molecular Cloning Techniques pg 
421, hereby incorporated by reference herein including any figures, tables, or drawings). 
However, other applications may require the use of conditions falling between these sets of 
conditions. Methods of determining the conditions required to achieve desired 
hybridizations are well-known to those with ordinary skill in the art, and are based on 
several factors, including but not limited to, the sequences to be hybridized and the 
samples to be tested. 

[0042] In other preferred embodiments, the invention features isolated, enriched, or 
purified nucleic acid molecules encoding kinase polypeptides, further comprising a vector 
or promoter effective to initiate transcription in a host cell. The invention also features 
recombinant nucleic acid, preferably in a cell or an organism. The recombinant nucleic 
acid may contain a sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, 
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:17, SEQ ID NO:19, SEQ ID 
NO:20, SEQ ID NO:21, SEQ ID NO:27, SEQ ID NO:96, SEQ ID NO:98, SEQ ID 
NO:100, SEQ ID NO:102, SEQ ID NO:104, or SEQ ID NO:106, or a functional derivative 
thereof and a vector or a promoter effective to initiate transcription in a host cell. The 
recombinant nucleic acid can alternatively contain a transcriptional initiation region 
functional in a cell, a sequence complementary to an RNA sequence encoding a kinase 
polypeptide and a transcriptional termination region functional in a cell. Specific vectors 
and host cell combinations are discussed herein. 

[0043] The term "vector" relates to a single- or double-stranded circular nucleic 
acid molecule that can be transfected into cells and replicated within or independently of a 
cell genome. A circular double-stranded nucleic acid molecule can be cut and thereby 
linearized upon treatment with restriction enzymes. An assortment of nucleic acid vectors, 
restriction enzymes, and the knowledge of the nucleotide sequences cut by restriction 
enzymes are readily available to those skilled in the art. A nucleic acid molecule encoding 
a kinase can be inserted into a vector by cutting the vector with restriction enzymes and 
ligating the two pieces together. 

[0044] The term "transfecting" defines a number of methods to insert a nucleic 
acid vector or other nucleic acid molecules into a cellular organism. These methods 
involve a variety of techniques, such as treating the cells with high concentrations of salt, 
an electric field, detergent, or DMSO to render the outer membrane or wall of the cells 
permeable to nucleic acid molecules of interest or use of various viral transduction 
strategies. 

[0045] The term "promoter" as used herein, refers to a nucleic acid sequence needed 
for gene sequence expression. Promoter regions vary from organism to organism, but are 
well known to persons skilled in the art for different organisms. For example, in 
prokaryotes, the promoter region contains both the promoter (which directs the initiation 
of RNA transcription) as well as the DNA sequences which, when transcribed into RNA, 
will signal synthesis initiation. Such regions will normally include those 5'-non-coding 
sequences involved with initiation of transcription and translation, such as the TATA box, 
capping sequence, CAAT sequence, and the like. 

[0046] In preferred embodiments, the isolated nucleic acid comprises, consists 
essentially of, or consists of a nucleic acid sequence set forth in SEQ ID NO:1, SEQ ID 
NO:2, SEQ ID NO:3, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:17, 
SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:27, SEQ ID NO:96, SEQ ID 
NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, or SEQ ID NO:106, or the 
corresponding full-length sequence, encodes the amino acid sequence of SEQ ID NO:5, 
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID 
NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, 
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or 
SEQ ID NO:107, or the corresponding full-length amino acid sequence, a functional 
derivative thereof, or at least 40, 45, 50, 60, 100, 200, or 300 contiguous amino acids of 
SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or of the corresponding full-length amino 
acid sequence; at least 250, 255, 275, 300, or 400 contiguous amino acids of SEQ ID 
NO:13, SEQ ID NO:14, SEQ ID NO:15, or of the corresponding full-length amino acid 
sequence; at least 27, 30, 35, 40, 50, 100, 200, or 300 contiguous amino acids of SEQ ID 
NO:18; at least 16, 25, 35, 50, 100, 200, or 300 contiguous amino acids of SEQ ID NO:22, 
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, or SEQ ID NO:103, or 
of the corresponding full-length amino acid sequence; 6 (preferably 10, more preferably 
15, most preferably 25) or more contiguous amino acids set forth in the amino acid 
sequence of SEQ ID NO:97 or SEQ ID NO:99, or the corresponding full-length amino 
acid sequence; 22 (preferably 30, more preferably 35, most preferably 45) or more 
contiguous amino acids set forth in the amino acid sequence of SEQ ID NO:101, or the 
corresponding full-length amino acid sequence; or at least 80, 85, 90, 100, 200, or 300 
contiguous amino acids of SEQ ID NO:107, or functional derivatives thereof. The kinase 
polypeptides, selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5, comprise, consist essentially of, or consist of at least 40, 45, 50, 60, 100, 
200, or 300 contiguous amino acids of SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7; at 
least 250, 255, 275, 300, or 400 contiguous amino acids of SEQ ID NO:13, SEQ ID 
NO:14, SEQ ID NO:15, or SEQ ID NO:105; at least 27, 30, 35, 40, 50, 100, 200, or 300 
contiguous amino acids of SEQ ID NO:18; at least 35, 40, 45, 50, 100, 200, or 300 
contiguous amino acids of SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID 
NO:29, SEQ ID NO:31, or SEQ ID NO:103; 6 (preferably 10, more preferably 15, most 
preferably 25) or more contiguous amino acids set forth in the amino acid sequence of 
SEQ ID NO:97 or SEQ ID NO:99; 22 (preferably 30, more preferably 35, most preferably 
45) or more contiguous amino acids set forth in the amino acid sequence of SEQ ID 
NO:101; or at least 80, 85, 90, 100, 200, or 300 contiguous amino acids of SEQ ID 
NO:107, or the corresponding full-length sequences or derivatives thereof. The nucleic 
acid may be isolated from a natural source by cDNA cloning or by subtractive 
hybridization. The natural source may be mammalian, preferably human, blood, semen, or 
tissue, and the nucleic acid may be synthesized by the triester method or by using an 
automated DNA synthesizer. 

[0047] The term "mammal" refers preferably to such organisms as mice, rats, 
rabbits, guinea pigs, sheep, and goats, more preferably to cats, dogs, monkeys, and apes, 
and most preferably to humans. 

[0048] In yet other preferred embodiments, the nucleic acid is a conserved or 
unique region, for example those useful for: the design of hybridization probes to facilitate 
identification and cloning of additional polypeptides, the design of PCR probes to facilitate 
cloning of additional polypeptides, obtaining antibodies to polypeptide regions, and 
designing antisense oligonucleotides. 

[0049] By "conserved nucleic acid regions", are meant regions present on two or 
more nucleic acids encoding a kinase polypeptide, to which a particular nucleic acid 
sequence can hybridize under lower stringency conditions. Examples of lower stringency 
conditions suitable for screening for nucleic acid encoding kinase polypeptides are 
provided in Abe et al. (J. Biol. Chem. 267(19):13361-13368, 1992), hereby incorporated by 
reference herein in its entirety, including any drawings, figures, or tables. Preferably, 
conserved regions differ by no more than 5 out of 20 nucleotides, even more preferably 2 
out of 20 nucleotides or most preferably 1 out of 20 nucleotides. 

[0050] By "unique nucleic acid region" is meant a sequence present in a nucleic 
acid coding for a kinase polypeptide that is not present in a sequence coding for any other 
naturally occurring polypeptide. Such regions preferably encode 32 (preferably 40, more 
preferably 45, most preferably 55) or more contiguous amino acids set forth in the amino 
acid sequence of SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or the corresponding 
full-length amino acid sequence; 250 (preferably 255, more preferably 260, most 
preferably 270) or more contiguous amino acids set forth in the amino acid sequence SEQ 
ID NO:13, SEQ ID NO:14, SEQ ID NO:15, or SEQ ID NO:105, or the corresponding 
full-length amino acid sequence; 27 (preferably 30, more preferably 40, most preferably 
45) or more contiguous amino acids set forth in the amino acid sequence SEQ ID NO:18; 
16 (preferably 20, more preferably 25, most preferably 35) or more contiguous amino acids 
set forth in the amino acid sequence SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ 
ID NO:29, SEQ ID NO:31, or SEQ ID NO:103, or the corresponding full-length amino 
acid sequence; 6 (preferably 10, more preferably 15, most preferably 25) or more 
contiguous amino acids set forth in the amino acid sequence of SEQ ID NO:97 or SEQ ID 
NO:99; 22 (preferably 30, more preferably 35, most preferably 45) or more contiguous 
amino acids set forth in the amino acid sequence of SEQ ID NO:101, or the corresponding 
full-length amino acid sequence; or 78 (preferably 80, more preferably 85, most preferably 
90) or more contiguous amino acids set forth in the amino acid sequence SEQ ID NO:107, 
or functional derivatives thereof. In particular, a unique nucleic acid region is preferably of 
mammalian origin. 

[0051] A second aspect of the invention features a nucleic acid probe for the 
detection of nucleic acid encoding a kinase polypeptide in a sample, wherein said 
polypeptide is selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5. Preferably, the nucleic acid probe encodes a kinase polypeptide that is a fragment 
of the protein encoded by the amino acid sequence set forth in SEQ ID NO:5, SEQ ID 
NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, 
SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID 
NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID 
NO:107, or the corresponding full-length amino acid sequences, not to include fragments 
consisting only of amino acids 1-22 of SEQ ID NO:13 or amino acids 1-33 of SEQ ID 
NO:107. The nucleic acid probe contains a nucleotide base sequence that will hybridize to 
a sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:9, SEQ 
ID NO:10, SEQ ID NO:11, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID 
NO:21, SEQ ID NO:27, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID 
NO:102, SEQ ID NO:104, or SEQ ID NO:106, or the corresponding full-length sequence, 
or a functional derivative thereof. 

[0052] In preferred embodiments, the nucleic acid probe hybridizes to nucleic acid 
encoding at least 6, 12, 75, 90, 105, 120, 150, 200, 250, 300 or 350 contiguous amino 
acids of the sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID 
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, 
SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID 
NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, or the corresponding 
full-length amino acid sequence, or functional derivatives thereof. 

[0053] Methods for using the probes include detecting the presence or amount of 
kinase RNA in a sample by contacting the sample with a nucleic acid probe under 
conditions such that hybridization occurs and detecting the presence or amount of the 
probe bound to kinase RNA. The nucleic acid duplex formed between the probe and a 
nucleic acid sequence coding for a kinase polypeptide may be used in the identification of 
the sequence of the nucleic acid detected (Nelson et al., in Nonisotopic DNA Probe 
Techniques, Academic Press, San Diego, Kricka, ed., p. 275, 1992, hereby incorporated by 
reference herein in its entirety, including any drawings, figures, or tables). Kits for 
performing such methods may be constructed to include a container means having 
disposed therein a nucleic acid probe. 

[0054] In a third aspect, the invention describes a recombinant cell or tissue 
comprising a nucleic acid molecule encoding a kinase polypeptide selected from the group 
consisting of STLK2, STLK3, STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, 
KHS2, SULU1, SULU3, GEK2, PAK4, and PAK5. In such cells, the nucleic acid may be 
under the control of the genomic regulatory elements, or may be under the control of 
exogenous regulatory elements including an exogenous promoter. By "exogenous" it is 
meant a promoter that is not normally coupled in vivo transcriptionally to the coding 
sequence for the kinase polypeptides. 

[0055] The polypeptide is preferably a fragment of the protein encoded by the 
amino acid sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID 
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, 
SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID 
NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, or the corresponding 
full-length amino acid sequence, not to include fragments consisting only of amino acids 
1-22 of SEQ ID NO:13 or amino acids 1-33 of SEQ ID NO:107. By "fragment," is meant 
an amino acid sequence present in a kinase polypeptide. Preferably, such a sequence 
comprises at least 32, 45, 50, 60, 100, 200, or 300 contiguous amino acids of SEQ ID 
NO:5, SEQ ID NO:6, or SEQ ID NO:7, or of the corresponding full-length amino acid 
sequence; at least 250, 255, 275, 300, or 400 contiguous amino acids of SEQ ID NO:13, 
SEQ ID NO:14, SEQ ID NO:15, or SEQ ID NO:105, or of the corresponding full-length 
amino acid sequence; at least 27, 30, 35, 40, 50, 100, 200, or 300 contiguous amino acids 
of SEQ ID NO:18; at least 16, 25, 35, 50, 100, 200, or 300 contiguous amino acids of SEQ 
ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, or SEQ ID 
NO:103, or of the corresponding full-length amino acid sequence; 6 (preferably 10, more 
preferably 15, most preferably 25) or more contiguous amino acids set forth in the amino 
acid sequence of SEQ ID NO:97 or SEQ ID NO:99; 22 (preferably 30, more preferably 35, 
most preferably 45) or more contiguous amino acids set forth in the amino acid sequence 
of SEQ ID NO:101; at least 78, 85, 90, 100, 200, or 300 contiguous amino acids of SEQ 
ID NO:107, or the corresponding full-length amino acid sequence; or a functional 
derivative thereof. 

[0056] In a fourth aspect, the invention features an isolated, enriched, or purified 
kinase polypeptide selected from the group consisting of STLK2, STLK3, STLK4, 
STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, 
and PAK5. 

[0057] By "isolated" in reference to a polypeptide is meant a polymer of amino 
acids (2 or more amino acids) conjugated to each other, including polypeptides that are 
isolated from a natural source or that are synthesized. The isolated polypeptides of the 
present invention are unique in the sense that they are not found in a pure or separated state 
in nature. Use of the term "isolated" indicates that a naturally occurring sequence has been 
removed from its normal cellular environment. Thus, the sequence may be in a cell-free 
solution or placed in a different cellular environment. The term does not imply that the 
sequence is the only amino acid chain present, but that it is essentially free (about 
90-95% pure at least) of non-amino acid material naturally associated with it. 

[0058] By the use of the term "enriched" in reference to a polypeptide is meant that 
the specific amino acid sequence constitutes a significantly higher fraction (2-5 fold) of 
the total amino acid sequences present in the cells or solution of interest than in normal or 
diseased cells or in the cells from which the sequence was taken. This could be caused by 
a person by preferential reduction in the amount of other amino acid sequences present, or 
by a preferential increase in the amount of the specific amino acid sequence of interest, or 
by a combination of the two. However, it should be noted that enriched does not imply 
that there are no other amino acid sequences present, just that the relative amount of the 
sequence of interest has been significantly increased. The term significant here is used to 
indicate that the level of increase is useful to the person making such an increase, and 
generally means an increase relative to other amino acid sequences of about at least 2-fold, 
more preferably at least 5- to 10-fold or even more. The term also does not imply that 
there is no amino acid sequence from other sources. The other source of amino acid 
sequences may, for example, comprise amino acid sequence encoded by a yeast or 
bacterial genome, or a cloning vector such as pUC19. The term is meant to cover only 
those situations in which man has intervened to increase the proportion of the desired 
amino acid sequence. 
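
The fold-enrichment arithmetic described above can be made concrete with a small worked example. The sketch below is illustrative only: the function name and the milligram quantities are hypothetical, and the calculation simply compares the fraction of total polypeptide represented by the sequence of interest before and after an intervention.

```python
def fold_enrichment(target_before: float, total_before: float,
                    target_after: float, total_after: float) -> float:
    """Ratio of the target's fraction of total polypeptide after an
    intervention to its fraction before the intervention.

    (target_after / total_after) / (target_before / total_before),
    rearranged to a single division.
    """
    return (target_after * total_before) / (total_after * target_before)


if __name__ == "__main__":
    # Route 1: preferential reduction of other sequences.
    # 1 mg target in 100 mg total becomes 1 mg target in 20 mg total.
    print(fold_enrichment(1, 100, 1, 20))

    # Route 2: preferential increase of the sequence of interest.
    # 1 mg target in 100 mg total becomes 4 mg target in 100 mg total.
    print(fold_enrichment(1, 100, 4, 100))
```

Under these assumed quantities the first route gives a 5-fold and the second a 4-fold enrichment, both of which meet the "about at least 2-fold" level described above.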

[0059] It is also advantageous for some purposes that an amino acid sequence be in 
purified form. The term "purified" in reference to a polypeptide does not require absolute 
purity (such as a homogeneous preparation); instead, it represents an indication that the 
sequence is relatively purer than in the natural environment. Compared to the natural level 
this level should be at least 2-5 fold greater (e.g., in terms of mg/mL). Purification of at 
least one order of magnitude, preferably two or three orders, and more preferably four or 
five orders of magnitude is expressly contemplated. The substance is preferably free of 
contamination at a functionally significant level, for example 90%, 95%, or 99% pure. 

[0060] In preferred embodiments, the kinase polypeptide is a fragment of the 
protein encoded by the amino acid sequence set forth in SEQ ID NO:5, SEQ ID NO:6, 
SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID 
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, 
SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, 
or the corresponding full-length amino acid sequences, not to include fragments consisting 
only of amino acids 1-22 of SEQ ID NO:13 or amino acids 1-33 of SEQ ID NO:107. 
Preferably, the kinase polypeptide contains at least 32, 45, 50, 60, 100, 200, or 300 
contiguous amino acids of SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7, or the 
corresponding full-length amino acid sequence; at least 250, 255, 275, 300, or 400 
contiguous amino acids of SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, or SEQ ID 
NO:105, or the corresponding full-length amino acid sequence; at least 27, 30, 35, 40, 50, 
100, 200, or 300 contiguous amino acids of SEQ ID NO:18; at least 16, 25, 35, 50, 100, 
200, or 300 contiguous amino acids of SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, 
SEQ ID NO:29, SEQ ID NO:31, or SEQ ID NO:103, or the corresponding full-length 
amino acid sequence; 6 (preferably 10, more preferably 15, most preferably 25) or more 
contiguous amino acids set forth in the amino acid sequence of SEQ ID NO:97 or SEQ ID 
NO:99; 22 (preferably 30, more preferably 35, most preferably 45) or more contiguous 
amino acids set forth in the amino acid sequence of SEQ ID NO:101, or the corresponding 
full-length amino acid sequence; or at least 78, 85, 90, 100, 200, or 300 contiguous amino 
acids of SEQ ID NO:107, or a functional derivative thereof. 

[0061] In preferred embodiments, the kinase polypeptide comprises an amino acid 
sequence having (a) the amino acid sequence set forth in SEQ ID NO:5, SEQ ID NO:6, 
SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID 
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, 
SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107; 
(b) the amino acid sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, 
SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID 
NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, 
SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, except that it lacks one or more, 
but not all, of the following segments of amino acid residues: 1-21, 22-274, or 275-416 of 
SEQ ID NO:5, 1-31, 32-308, 309-489, or 490-516 of SEQ ID NO:6, 1-178 or 179-414 of 
SEQ ID NO:7, 1-22, 23-289, 290-526, 527-640, 641-896, or 897-1239 of SEQ ID NO:13, 
1-255, 256-442, 443-626, 627-954, or 955-1297 of SEQ ID NO:14, 1-255, 256-476, 
477-680, 681-983, or 984-1326 of SEQ ID NO:15, 1-13, 14-273, 274-346, 347-534, or 
535-894 of SEQ ID NO:18, 1-21, 22-277, 278-427, 428-637, 638-751, or 752-898 of SEQ ID 
NO:22, 1-66, 67-215, 216-425, 426-539, 540-786, or 787-887 of SEQ ID NO:23, 1-25, 
26-273, 274-422, 423-632, or 633-748 of SEQ ID NO:24, 1-51, 52-224, 225-393, 394-658, or 
659-681 of SEQ ID NO:29, 1-25, 26-281, 282-430, 431-640, 641-754, 755-901, or 
902-1001 of SEQ ID NO:31, 1-10, 11-321, or 322-373 of SEQ ID NO:97, 1-57, 58-369, or 
370-418 of SEQ ID NO:99, 1-52, 53-173, 174-307, 308-572, or 573-591 of SEQ ID 
NO:103, 1-24, 25-289, 290-397, 398-628, 629-668, 669-872, or 873-1227 of SEQ ID 
NO:105, or 1-33, 34-294, 295-337, 338-472, 473-724, or 725-968 of SEQ ID NO:107; (c) 
the amino acid sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID 
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, 
SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:103, SEQ 
ID NO:105, or SEQ ID NO:107 from amino acid residues 1-21, 22-274, or 275-416 of 
SEQ ID NO:5, 1-31, 32-308, 309-489, or 490-516 of SEQ ID NO:6, 1-178 or 179-414 of 
SEQ ID NO:7, 23-289, 290-526, 527-640, 641-896, or 897-1239 of SEQ ID NO:13, 1-255, 
256-442, 443-626, 627-954, or 955-1297 of SEQ ID NO:14, 1-255, 256-476, 477-680, 
681-983, or 984-1326 of SEQ ID NO:15, 1-13, 14-273, 274-346, 347-534, or 535-894 of 
SEQ ID NO:18, 1-21, 22-277, 278-427, 428-637, 638-751, or 752-898 of SEQ ID NO:22, 
1-66, 67-215, 216-425, 426-539, 540-786, or 787-887 of SEQ ID NO:23, 1-25, 26-273, 
274-422, 423-632, or 633-748 of SEQ ID NO:24, 1-51, 52-224, 225-393, 394-658, or 
659-681 of SEQ ID NO:29, 1-25, 26-273, 274-422, 423-632, 633-746, 747-993, or 994-1093 
of SEQ ID NO:31, 1-10, 11-321, or 322-373 of SEQ ID NO:97, 1-57, 58-369, or 370-418 
of SEQ ID NO:99, 1-52, 53-173, 174-307, 308-572, or 573-591 of SEQ ID NO:103, 1-24, 
25-289, 290-397, 398-628, 629-668, 669-872, or 873-1227 of SEQ ID NO:105, or 1-33, 
34-294, 295-337, 338-472, 473-724, or 725-968 of SEQ ID NO:107; or (d) the amino acid 
sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ 
ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID 
NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID NO:99, SEQ ID 
NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, except that it lacks one 
or more, but not all, of the domains selected from the group consisting of a C-terminal 
domain, a catalytic domain, an N-terminal domain, a spacer region, a proline-rich region, a 
coiled-coil structure region, an insert, and a C-terminal tail. 

[0062] The polypeptide can be isolated from a natural source by methods 
well-known in the art. The natural source may be mammalian, preferably human, blood, semen, 
or tissue, and the polypeptide may be synthesized using an automated polypeptide 
synthesizer. The isolated, enriched, or purified kinase polypeptide is preferably: a STLK2, 
STLK3, STLK4, STLK5, STLK6, or STLK7 polypeptide; a ZC1, ZC2, ZC3, or ZC4 
polypeptide; a KHS2 polypeptide; a SULU1 or SULU3 polypeptide; a GEK2 polypeptide; 
or a PAK4 or PAK5 polypeptide. 

[0063] In some embodiments the invention includes a recombinant kinase 
polypeptide selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5. By "recombinant kinase polypeptide" is meant a polypeptide produced by 
recombinant DNA techniques such that it is distinct from a naturally occurring polypeptide 
either in its location (e.g., present in a different cell or tissue than found in nature), purity 
or structure. Generally, such a recombinant polypeptide will be present in a cell in an 
amount different from that normally observed in nature. 

[0064] In a fifth aspect, the invention features an antibody (e.g., a monoclonal or 
polyclonal antibody) having specific binding affinity to a kinase polypeptide or a kinase 
polypeptide domain or fragment where the polypeptide is selected from the group 
consisting of STLK2, STLK3, STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, 
KHS2, SULU1, SULU3, GEK2, PAK4, and PAK5. By "specific binding affinity" is 
meant that the antibody binds to the target kinase polypeptide with greater affinity than it 
binds to other polypeptides under specified conditions. Antibodies or antibody fragments 
are polypeptides that contain regions that can bind other polypeptides. The term "specific 
binding affinity" describes an antibody that binds to a kinase polypeptide with greater 
affinity than it binds to other polypeptides under specified conditions. 

[0065] The term "polyclonal" refers to antibodies that are heterogeneous 
populations of antibody molecules derived from the sera of animals immunized with an 
antigen or an antigenic functional derivative thereof. For the production of polyclonal 
antibodies, various host animals may be immunized by injection with the antigen. Various 
adjuvants may be used to increase the immunological response, depending on the host 
species. 

[0066] "Monoclonal antibodies" are substantially homogeneous populations of 
antibodies to a particular antigen. They may be obtained by any technique which provides 
for the production of antibody molecules by continuous cell lines in culture. Monoclonal 
antibodies may be obtained by methods known to those skilled in the art (Kohler et al., 
Nature 256:495-497, 1975, and U.S. Patent No. 4,376,110, both of which are hereby 
incorporated by reference herein in their entirety including any figures, tables, or 
drawings). 

[0067] The term "antibody fragment" refers to a portion of an antibody, often the 
hypervariable region and portions of the surrounding heavy and light chains, that displays 
specific binding affinity for a particular molecule. A hypervariable region is a portion of 
an antibody that physically binds to the polypeptide target. 

[0068] Antibodies or antibody fragments having specific binding affinity to a 
kinase polypeptide of the invention may be used in methods for detecting the presence 
and/or amount of kinase polypeptide in a sample by probing the sample with the antibody 
under conditions suitable for kinase-antibody immunocomplex formation and detecting the 
presence and/or amount of the antibody conjugated to the kinase polypeptide. Diagnostic 
kits for performing such methods may be constructed to include antibodies or antibody 
fragments specific for the kinase as well as a conjugate of a binding partner of the 
antibodies or the antibodies themselves. 

[0069] An antibody or antibody fragment with specific binding affinity to a kinase 
polypeptide of the invention can be isolated, enriched, or purified from a prokaryotic or 
eukaryotic organism. Routine methods known to those skilled in the art enable production 
of antibodies or antibody fragments, in both prokaryotic and eukaryotic organisms. 
Purification, enrichment, and isolation of antibodies, which are polypeptide molecules, are 
described above. 

[0070] Antibodies having specific binding affinity to a kinase polypeptide of the 
invention may be used in methods for detecting the presence and/or amount of kinase 
polypeptide in a sample by contacting the sample with the antibody under conditions such 
that an immunocomplex forms and detecting the presence and/or amount of the antibody 
conjugated to the kinase polypeptide. Diagnostic kits for performing such methods may be 
constructed to include a first container containing the antibody and a second container 
having a conjugate of a binding partner of the antibody and a label, such as, for example, a 
radioisotope. The diagnostic kit may also include notification of an FDA approved use 
and instructions therefor. 

[0071] In a sixth aspect, the invention features a hybridoma which produces an 
antibody having specific binding affinity to a kinase polypeptide or a kinase polypeptide 
domain, where the polypeptide is selected from the group consisting of STLK2, STLK3, 
STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, 
PAK4, and PAK5. By "hybridoma" is meant an immortalized cell line that is capable of 
secreting an antibody, for example an antibody to a kinase of the invention. In preferred 
embodiments, the antibody to the kinase comprises a sequence of amino acids that is able 
to specifically bind a kinase polypeptide of the invention. 

[0072] In a seventh aspect, the invention features a kinase polypeptide binding 
agent able to bind to a kinase polypeptide selected from the group consisting of STLK2, 
STLK3, STLK4, STLK6, STLK7, STLK5, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, 
GEK2, PAK4, and PAK5. The binding agent is preferably a purified antibody that 
recognizes an epitope present on a kinase polypeptide of the invention. Other binding 
agents include molecules that bind to kinase polypeptides and analogous molecules that 
bind to a kinase polypeptide. Such binding agents may be identified by using assays that 
measure kinase binding partner activity, such as those that measure PDGFR activity. 

[0073] The invention also features a method for screening for human cells 
containing a kinase polypeptide of the invention or an equivalent sequence. The method 
involves identifying the novel polypeptide in human cells using techniques that are routine 
and standard in the art, such as those described herein for identifying the kinases of the 
invention (e.g., cloning, Southern or Northern blot analysis, in situ hybridization, PCR 
amplification, etc.). 

[0074] In an eighth aspect, the invention features methods for identifying a 
substance that modulates kinase activity comprising the steps of: (a) contacting a kinase 
polypeptide selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5 with a test substance; (b) measuring the activity of said polypeptide; and (c) 
determining whether said substance modulates the activity of said polypeptide. 

[0075] The term "modulates" refers to the ability of a compound to alter the 
function of a kinase of the invention. A modulator preferably activates or inhibits the 
activity of a kinase of the invention depending on the concentration of the compound 
exposed to the kinase. 

[0076] The term "activates" refers to increasing the cellular activity of the kinase. 
The term "inhibits" refers to decreasing the cellular activity of the kinase. Kinase activity is 
preferably the interaction with a natural binding partner. 

[0077] The term "modulates" also refers to altering the function of kinases of the 
invention by increasing or decreasing the probability that a complex forms between the 
kinase and a natural binding partner. A modulator preferably increases the probability that 
such a complex forms between the kinase and the natural binding partner, more preferably 
increases or decreases the probability that a complex forms between the kinase and the 
natural binding partner depending on the concentration of the compound exposed to the 
kinase, and most preferably decreases the probability that a complex forms between the 
kinase and the natural binding partner. 

[0078] The term "complex" refers to an assembly of at least two molecules bound 
to one another. Signal transduction complexes often contain at least two protein molecules 
bound to one another. For instance, a protein tyrosine receptor protein kinase, GRB2, 
SOS, RAF, and RAS assemble to form a signal transduction complex in response to a 
mitogenic ligand. 

[0079] The term "natural binding partner" refers to polypeptides, lipids, small 
molecules, or nucleic acids that bind to kinases in cells. A change in the interaction 
between a kinase and a natural binding partner can manifest itself as an increased or 
decreased probability that the interaction forms, or an increased or decreased concentration 
of kinase/natural binding partner complex. 

[0080] The term "contacting" as used herein refers to mixing a solution comprising 
the test compound with a liquid medium bathing the cells of the methods. The solution 
comprising the compound may also comprise another component, such as dimethyl 
sulfoxide (DMSO), which facilitates the uptake of the test compound or compounds into 
the cells of the methods. The solution comprising the test compound may be added to the 
medium bathing the cells by utilizing a delivery apparatus, such as a pipet-based device or 
syringe-based device. 

[0081] In a ninth aspect, the invention features methods for identifying a substance 
that modulates kinase activity in a cell comprising the steps of: (a) expressing a kinase 
polypeptide in a cell, wherein said polypeptide is selected from the group consisting of 
STLK2, STLK3, STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, 
SULU3, GEK2, PAK4, and PAK5; (b) adding a test substance to said cell; and (c) 
monitoring a change in cell phenotype or the interaction between said polypeptide and a 
natural binding partner. 

[0082] The term "expressing" as used herein refers to the production of kinases of 
the invention from a nucleic acid vector containing kinase genes within a cell. The nucleic 
acid vector is transfected into cells using well known techniques in the art as described 
herein. 

[0083] In a tenth aspect, the invention provides methods for treating a disease by 
administering to a patient in need of such treatment a substance that modulates the activity 
of a kinase selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5. Preferably, the disease is selected from the group consisting of immune-related 
diseases and disorders, organ transplantation, myocardial infarction, cardiovascular 
disease, stroke, renal failure, oxidative stress-related neurodegenerative disorders, and 
cancer. Most preferably, the immune-related diseases and disorders include, but are not 
limited to, rheumatoid arthritis, atherosclerosis, and autoimmune disorders. 

[0084] In preferred embodiments, the invention provides methods for treating or 
preventing a disease or disorder by administering to a patient in need of such treatment a 
substance that modulates the activity of a kinase polypeptide selected from the group 
consisting of ZC1, ZC2, ZC3, ZC4, KHS2, PAK4, and PAK5. Preferably, the disease or 
disorder is selected from the group consisting of rheumatoid arthritis, atherosclerosis, 
autoimmune disorders, and organ transplantation. The invention also features methods of 
treating or preventing a disease or disorder by administering to a patient in need of such 
treatment a substance that modulates the activity of a kinase polypeptide selected from the 
group consisting of STLK1, STLK2, STLK3, STLK4, STLK5, STLK6, and STLK7. 
Preferably the disease or disorder is selected from the group consisting of immune-related 
diseases and disorders, myocardial infarction, cardiomyopathies, stroke, renal failure, and 
oxidative stress-related neurodegenerative disorders. Most preferably, the immune-related 
diseases and disorders are selected from the group consisting of rheumatoid arthritis, 
chronic inflammatory bowel disease, chronic inflammatory pelvic disease, multiple 
sclerosis, asthma, osteoarthritis, psoriasis, atherosclerosis, rhinitis, autoimmunity, and 
organ transplantation. 

[0085] The invention also features methods of treating or preventing a disease or 
disorder by administering to a patient in need of such treatment a substance that modulates 
the activity of a kinase polypeptide selected from the group consisting of ZC1, ZC2, ZC3, 
and ZC4. Preferably the disease is selected from the group consisting of immune-related 
diseases and disorders, cardiovascular disease, and cancer. Most preferably, the immune- 
related diseases and disorders are selected from the group consisting of rheumatoid 
arthritis, chronic inflammatory bowel disease, chronic inflammatory pelvic disease, 
multiple sclerosis, asthma, osteoarthritis, psoriasis, atherosclerosis, rhinitis, autoimmunity, 
and organ transplantation. 

[0086] Substances useful for treatment of kinase-related disorders or diseases 
preferably show positive results in one or more in vitro assays for an activity 
corresponding to treatment of the disease or disorder in question (Examples of such assays 
are provided in the references in section VI, below; and in Example 7, herein). Examples 
of substances that can be screened for favorable activity are provided and referenced in 
section VI, below. The substances that modulate the activity of the kinases preferably 
include, but are not limited to, antisense oligonucleotides and inhibitors of protein kinases, 
as determined by methods and screens referenced in section VI and Example 7, below. 

[0087] The term "preventing" refers to decreasing the probability that an organism 
contracts or develops an abnormal condition. 

[0088] The term "treating" refers to having a therapeutic effect and at least 
partially alleviating or abrogating an abnormal condition in the organism. 

[0089] The term "therapeutic effect" refers to the inhibition or activation of factors 
causing or contributing to the abnormal condition. A therapeutic effect relieves to some 
extent one or more of the symptoms of the abnormal condition. In reference to the 
treatment of abnormal conditions, a therapeutic effect can refer to one or more of the 
following: (a) an increase in the proliferation, growth, and/or differentiation of cells; (b) 
inhibition (i.e., slowing or stopping) of cell death; (c) inhibition of degeneration; (d) 
relieving to some extent one or more of the symptoms associated with the abnormal 
condition; and (e) enhancing the function of the affected population of cells. Compounds 
demonstrating efficacy against abnormal conditions can be identified as described herein. 

[0090] The term "abnormal condition" refers to a function in the cells or tissues of 
an organism that deviates from their normal functions in that organism. An abnormal 
condition can relate to cell proliferation, cell differentiation, or cell survival. 

[0091] Abnormal cell proliferative conditions include cancers such as fibrotic and 
mesangial disorders, abnormal angiogenesis and vasculogenesis, wound healing, psoriasis, 
diabetes mellitus, and inflammation. 

[0092] Abnormal differentiation conditions include, but are not limited to, 
neurodegenerative disorders, slow wound healing rates, and slow tissue grafting healing 
rates. 

[0093] Abnormal cell survival conditions relate to conditions in which 
programmed cell death (apoptosis) pathways are activated or abrogated. A number of 
protein kinases are associated with the apoptosis pathways. Aberrations in the function of 
any one of the protein kinases could lead to cell immortality or premature cell death. 

[0094] The term "aberration", in conjunction with the function of a kinase in a 
signal transduction process, refers to a kinase that is over- or under-expressed in an 
organism, mutated such that its catalytic activity is lower or higher than wild-type protein 
kinase activity, mutated such that it can no longer interact with a natural binding partner, is 
no longer modified by another protein kinase or protein phosphatase, or no longer interacts 
with a natural binding partner. 

[0095] The term "administering" relates to a method of incorporating a compound 
into cells or tissues of an organism. The abnormal condition can be prevented or treated 
when the cells or tissues of the organism exist within the organism or outside of the 
organism. Cells existing outside the organism can be maintained or grown in cell culture 
dishes. For cells harbored within the organism, many techniques exist in the art to 
administer compounds, including (but not limited to) oral, parenteral, dermal, injection, 
and aerosol applications. For cells outside of the organism, multiple techniques exist in 
the art to administer the compounds, including (but not limited to) cell microinjection 
techniques, transformation techniques, and carrier techniques. 

[0096] The abnormal condition can also be prevented or treated by administering a 
compound to a group of cells having an aberration in a signal transduction pathway to an 
organism. The effect of administering a compound on organism function can then be 
monitored. The organism is preferably a mouse, rat, rabbit, guinea pig, or goat, more 
preferably a monkey or ape, and most preferably a human. 

[0097] In an eleventh aspect, the invention features methods for detection of a 
kinase polypeptide in a sample as a diagnostic tool for diseases or disorders, wherein the 
method comprises the steps of: (a) contacting the sample with a nucleic acid probe which 
hybridizes under hybridization assay conditions to a nucleic acid target region of a kinase 
polypeptide selected from the group consisting of STLK2, STLK3, STLK4, STLK5, 
STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, GEK2, PAK4, and 
PAK5, said probe comprising the nucleic acid sequence encoding the polypeptide, 
fragments thereof, and the complements of the sequences and fragments; and (b) detecting 
the presence or amount of the probe:target region hybrid as an indication of the disease. 

[0098] In preferred embodiments of the invention, the disease or disorder is 
selected from the group consisting of rheumatoid arthritis, atherosclerosis, autoimmune 
disorders, organ transplantation, myocardial infarction, cardiomyopathies, stroke, renal 
failure, oxidative stress-related neurodegenerative disorders, and cancer. In other preferred 
embodiments, the kinase polypeptide is selected from the group consisting of PAK4 and 
PAK5, or the polypeptide is selected from the group consisting of ZC1, ZC2, ZC3, and 
ZC4, and the disease is cancer. 

[0099] The kinase "target region" is the nucleotide base sequence set forth in SEQ 
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, 
SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:27, SEQ ID 
NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, or SEQ ID 
NO:106, or the corresponding full-length sequences, a functional derivative thereof, or a 
fragment thereof to which the nucleic acid probe will specifically hybridize. Specific 
hybridization indicates that, in the presence of other nucleic acids, the probe hybridizes 
detectably only with the target region of the kinase of the invention. Putative target regions can be 
identified by methods well known in the art consisting of alignment and comparison of the 
most closely related sequences in the database. 

[0100] In preferred embodiments the nucleic acid probe hybridizes to a kinase 
target region encoding at least 6, 12, 75, 90, 105, 120, 150, 200, 250, 300 or 350 
contiguous amino acids of the sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID 
NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:22, 
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:97, SEQ ID 
NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, or the 
corresponding full-length amino acid sequence, or a functional derivative thereof. 
Hybridization conditions should be such that hybridization occurs only with the kinase 
genes in the presence of other nucleic acid molecules. Under stringent hybridization 
conditions only highly complementary nucleic acid sequences hybridize. Preferably, such 
conditions prevent hybridization of nucleic acids having more than 1 or 2 mismatches out 
of 20 contiguous nucleotides. Such conditions are defined supra. 
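
The mismatch criterion stated above can be illustrated computationally. The following is a minimal sketch under stated assumptions and is not part of the disclosed methods: it assumes an ungapped, equal-length pairing of probe and target, and the function names and the default threshold of 2 are hypothetical choices made only for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def max_mismatches_in_window(probe: str, target: str, window: int = 20) -> int:
    """Largest number of mismatches in any `window` contiguous nucleotides.

    Assumption: the probe pairs with the target without gaps, so the probe's
    reverse complement is compared position by position with the target.
    """
    expected = reverse_complement(probe)
    if len(expected) != len(target):
        raise ValueError("probe and target must have equal length")
    mismatch = [a != b for a, b in zip(expected, target.upper())]
    if len(mismatch) <= window:
        return sum(mismatch)
    return max(
        sum(mismatch[i:i + window]) for i in range(len(mismatch) - window + 1)
    )


def passes_stringency(probe: str, target: str, allowed: int = 2) -> bool:
    """True if no 20-nucleotide window exceeds `allowed` mismatches."""
    return max_mismatches_in_window(probe, target) <= allowed
```

A probe that is the exact reverse complement of a 20-nucleotide target has zero mismatches and passes; a target carrying three substitutions within the same 20 nucleotides would be rejected at the default threshold.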

[0101] The diseases for which detection of kinase genes in a sample could be 
diagnostic include diseases in which kinase nucleic acid (DNA and/or RNA) is amplified 
in comparison to normal cells. By "amplification" is meant increased numbers of kinase 
DNA or RNA in a cell compared with normal cells. In normal cells, kinases are typically 
found as single copy genes. In selected diseases, the chromosomal location of the kinase 
genes may be amplified, resulting in multiple copies of the gene, or amplification. Gene 
amplification can lead to amplification of kinase RNA, or kinase RNA can be amplified in 
the absence of kinase DNA amplification. 

[0102] "Amplification" as it refers to RNA can be the detectable presence of kinase 
RNA in cells, since in some normal cells there is no basal expression of kinase RNA. In 
other normal cells, a basal level of expression of kinase exists; therefore, in these cases 
amplification is the detection of at least 1-2-fold, and preferably more, kinase RNA, 
compared to the basal level. 

[0103] The diseases that could be diagnosed by detection of kinase nucleic acid in 
a sample preferably include cancers. The test samples suitable for nucleic acid probing 
methods of the present invention include, for example, cells or nucleic acid extracts of 
cells, or biological fluids. The samples used in the above-described methods will vary 
based on the assay format, the detection method and the nature of the tissues, cells or 
extracts to be assayed. Methods for preparing nucleic acid extracts of cells are well known 
in the art and can be readily adapted in order to obtain a sample that is compatible with the 
method utilized. 

[0104] In a final aspect, the invention features a method for detection of a kinase 
polypeptide in a sample as a diagnostic tool for a disease or disorder, wherein the method 
comprises: (a) comparing a nucleic acid target region encoding the kinase polypeptide in a 
sample, where the kinase polypeptide is selected from the group consisting of STLK2, 
STLK3, STLK4, STLK5, STLK6, STLK7, ZC1, ZC2, ZC3, ZC4, KHS2, SULU1, SULU3, 
GEK2, PAK4, and PAK5, or one or more fragments thereof, with a control nucleic acid 
target region encoding the kinase polypeptide, or one or more fragments thereof; and (b) 
detecting differences in sequence or amount between the target region and the control 
target region, as an indication of the disease or disorder. Preferably, the disease or disorder 
is selected from the group consisting of immune-related diseases and disorders, organ 
transplantation, myocardial infarction, cardiovascular disease, stroke, renal failure, 
oxidative stress-related neurodegenerative disorders, and cancer. Immune-related diseases 
and disorders include, but are not limited to, those discussed previously. 

[0105] The term "comparing" as used herein refers to identifying discrepancies 
between the nucleic acid target region isolated from a sample, and the control nucleic acid 
target region. The discrepancies can be in the nucleotide sequences, e.g. insertions, 
deletions, or point mutations, or in the amount of a given nucleotide sequence. Methods to 
determine these discrepancies in sequences are well-known to one of ordinary skill in the 
art. The "control" nucleic acid target region refers to the sequence or amount of the 
sequence found in normal cells, e.g. cells that are not diseased as discussed previously. 
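
The sequence comparison described above can be sketched as follows. This is an illustrative example only, not the method of the invention: it uses the Python standard-library `difflib.SequenceMatcher` as a stand-in for a dedicated alignment tool, and the function name `find_discrepancies` is hypothetical.

```python
from difflib import SequenceMatcher


def find_discrepancies(control: str, sample: str):
    """List differences between a control and a sample nucleotide sequence.

    Each entry is (kind, control_segment, sample_segment), where kind is
    "insertion", "deletion", or "substitution".
    """
    kinds = {
        "insert": "insertion",
        "delete": "deletion",
        "replace": "substitution",
    }
    # autojunk=False disables the heuristic that ignores frequent characters,
    # which would otherwise misbehave on long sequences over a 4-letter alphabet.
    matcher = SequenceMatcher(None, control, sample, autojunk=False)
    return [
        (kinds[tag], control[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Identical sequences yield an empty list. For real samples, a purpose-built aligner would be preferable, since `difflib` optimizes for matching blocks rather than for a biological scoring scheme.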

[0106] The invention has been described broadly and generically herein. Each of 
the narrower species and subgeneric groupings falling within the generic disclosure also 
form part of the invention. This includes the generic description of the invention with a 
proviso or negative limitation removing any subject matter from the genus, regardless of 
whether or not the excised material is specifically recited herein. For example, in some 
instances the nucleotide sequence of the ZC4 kinase polypeptide may not be part of a 
preferred embodiment. 

[0107] The summary of the invention described above is not limiting and other 
features and advantages of the invention will be apparent from the following detailed 
description of the invention, and from the claims. 

BRIEF DESCRIPTION OF THE FIGURES 

[0108] Figures 1A, 1B and 1C show a multiple sequence alignment of the amino 
acid sequences (SEQ ID NOS 84-85, 5-7, respectively, in order of appearance) of the 
STE20-STE20 family kinases. 

[0109] Figures 2A and 2B show a multiple sequence alignment of the amino acid 
sequences (SEQ ID NOS 84, 86-87 & 8, respectively, in order of appearance) of the 
STE20-STLK5 family kinases. 

[0110] Figures 3A, 3B, 3C, 3D, 3E, 3F and 3G show a multiple sequence 
alignment of the amino acid sequences (SEQ ID NOS 88-89, 13-16, respectively, in order 
of appearance) of STE20-ZC family kinases. 

[0111] Figures 4A, 4B and 4C show a pairwise sequence (SEQ ID NOS 91 & 18, 
respectively, in order of appearance) alignment of STE20-KHS family kinases. 

[0112] Figures 5A, 5B, 5C and 5D show a multiple sequence alignment of the 
amino acid sequences (SEQ ID NOS 90, 22, 24 & 151, respectively, in order of 
appearance) of STE20-SULU family kinases. 

[0113] Figures 6A, 6B and 6C show a pairwise sequence (SEQ ID NOS 92 & 26, 
respectively, in order of appearance) alignment of STE20-GEK family kinases. 

[0114] Figures 7A, 7B and 7C show a multiple sequence alignment of the amino 
acid sequences (SEQ ID NOS 93-95, 29-30 respectively, in order of appearance) of 
STE20-PAK family kinases. 

[0115] Figures 8A, 8B, 8C, 8D, 8E, 8F and 8G show the amino acid sequences of 
human STLK2 (SEQ ID NO:5), human STLK3 (SEQ ID NO:6), human STLK4 (SEQ ID 
NO:7), human STLK5 (SEQ ID NO:8), human ZC1 (SEQ ID NO:13), human ZC2 (SEQ ID 
NO:14), human ZC3 (SEQ ID NO:15), human ZC4 (SEQ ID NO:16), human KHS2 (SEQ 
ID NO:18), human SULU1 (SEQ ID NO:22), human SULU3 (SEQ ID NO:23), murine 
SULU3 (SEQ ID NO:24), human GEK2 (SEQ ID NO:26), human PAK4 (SEQ ID NO:29), 
and human PAK5 (SEQ ID NO:30). 

[0116] Figures 9A, 9B, 9C, 9D, 9E, 9F, 9G, 9H, 9I, 9J, 9K, 9L, 9M, 9N, 9O, 9P, 
9Q, 9R, 9S, 9T, 9U and 9V show the nucleic acid sequences of human STLK2 (SEQ ID 
NO:1), human STLK3 (SEQ ID NO:2), human STLK4 (SEQ ID NO:3), human 
STLK5 (SEQ ID NO:4), human ZC1 (SEQ ID NO:9), human ZC2 (SEQ ID NO:10), human 
ZC3 (SEQ ID NO:11), human ZC4 (SEQ ID NO:12), human KHS2 (SEQ ID NO:17), 
human SULU1 (SEQ ID NO:19), human SULU3 (SEQ ID NO:20), murine SULU3 (SEQ ID 
NO:21), human GEK2 (SEQ ID NO:25), human PAK4 (SEQ ID NO:27), and human 
PAK5 (SEQ ID NO:28). 

[0117] Figures 10A, 10B and 10C show the full-length amino acid sequences of 
human STLK5 (SEQ ID NO:97), human PAK5 (SEQ ID NO:103), and human ZC4 (SEQ 
ID NO:105), as well as the partial amino acid sequences of human full-length STLK6 
(SEQ ID NO:99) and human STLK7 (SEQ ID NO:101) and human GEK2 (SEQ ID 
NO:107). 

[0118] Figures 11A, 11B, 11C, 11D, 11E, 11F, 11G and 11H show the full-length 
nucleic acid sequences of human STLK5 (SEQ ID NO:96), human PAK5 (SEQ ID 
NO:102), and human ZC4 (SEQ ID NO:104), as well as the partial nucleic acid sequences 
of human STLK6 (SEQ ID NO:98) and human STLK7 (SEQ ID NO:100) and human 
GEK2 (SEQ ID NO:106). 

[0119] Figures 12A and 12B show a multiple sequence alignment among human 
SPAK (SEQ ID NO:153), human STLK6 (SEQ ID NO:99), human STLK7 (SEQ ID 
NO:101) and full-length human STLK5 (SEQ ID NO:152). 

[0120] Figures 13A, 13B and 13C show a multiple sequence alignment among 
human PAK1 (SEQ ID NO:93), human PAK4 (SEQ ID NO:29) and human PAK5 (SEQ 
ID NO:103). 

[0121] Figures 14A, 14B and 14C show a pair-wise sequence alignment between 
human ZC1 (SEQ ID NO:15) and human ZC4 (SEQ ID NO:105). 

[0122] Figures 15A, 15B and 15C show a pair-wise sequence alignment between 
LOK1 (SEQ ID NO:154) and full-length GEK2 (SEQ ID NO:155). 

DETAILED DESCRIPTION OF THE INVENTION 

[0123] The present invention relates in part to kinase polypeptides, nucleic acids 
encoding such polypeptides, cells containing such nucleic acids, antibodies to such 
polypeptides, assays utilizing such polypeptides, and methods relating to all of the 
foregoing. The present invention is based upon the isolation and characterization of new 
kinase polypeptides. The polypeptides and nucleic acids may be produced using well- 
known and standard synthesis techniques when given the sequences presented herein. 

[0124] The recent elucidation of the DNA sequence of Saccharomyces cerevisiae 
has provided the first complete example of the genetic information contained in a simple 
eukaryotic organism. Analysis of this yeast genome revealed that it contains at least 113 
protein kinases. These kinases were further subdivided into several structurally related 
groups. One of these newly defined groups was termed the STE20-family to represent its 
founding member STE20, which is a protein kinase involved in the yeast pheromone 
response pathway that initiates a protein kinase cascade in response to a G-protein 
mediated signal. S. cerevisiae has two additional members of this family, CLA4, and 
YOL113W (HRA655). 

[0125] Several mammalian homologues have recently been identified that belong 
to the STE20-family, including SOK-1 (human STE20), GC-kinase, KHS, HPK1, NIK, 
SLK, GEK, PAK1, PAK65, MST1, and CDC7. Furthermore, the Drosophila and the C. 
elegans genome efforts have identified additional protein kinases which belong to the 
STE20-family, yet have structurally unique extracatalytic domains, including ZC504.4 and 
SULU kinases from C. elegans, and NINAC of Drosophila. 

[0126] STE20-related protein kinases have been implicated as regulating a variety 
of cellular responses, including response to growth factors or cytokines, oxidative-, UV-, 
or irradiation-related stress pathways, inflammatory signals (i.e., TNFα), apoptotic stimuli 
(i.e., Fas), T and B cell costimulation, the control of cytoskeletal architecture, and cellular 
transformation. Typically, the STE20-related kinases serve as upstream regulators of 
MAPK cascades. Examples include: HPK1, a protein-serine/threonine kinase (STK) that 
possesses a STE20-like kinase domain that activates a protein kinase pathway leading to 
the stress-activated protein kinase SAPK/JNK; PAK1, an STK with an upstream CDC42- 
binding domain that interacts with Rac and plays a role in cellular transformation through 
the Ras-MAPK pathway; and murine NIK, which interacts with upstream receptor tyrosine 
kinases and connects with downstream STE11-family kinases. 

[0127] The STE20-kinases possess a variety of non-catalytic domains that are 
believed to interact with upstream regulators. Examples include proline-rich domains for 
interaction with SH3-containing proteins, or specific domains for interaction with Rac, 
Rho, and Rab small G-proteins. These interactions may provide a mechanism for cross- 
talk between distinct biochemical pathways in response to external stimuli such as the 
activation of a variety of cell surface receptors, including tyrosine kinases, cytokine 
receptors, TNF receptor, Fas, T cell receptors, CD28, or CD40. 

I. The Nucleic Acids of the Invention 

[0128] Included within the scope of this invention are the functional equivalents of 
the herein-described isolated nucleic acid molecules. The degeneracy of the genetic code 
permits substitution of certain codons by other codons that specify the same amino acid 
and hence would give rise to the same protein. The nucleic acid sequence can vary 
substantially since, with the exception of methionine and tryptophan, the known amino 
acids can be coded for by more than one codon. Thus, portions or all of the kinase genes 
of the invention could be synthesized to give a nucleic acid sequence significantly different 
from that shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:9, SEQ ID 
NO:10, SEQ ID NO:11, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, 
SEQ ID NO:27, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ 
ID NO:104, and SEQ ID NO:106. The encoded amino acid sequence thereof would, 
however, be preserved. 
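
The degeneracy argument above can be checked with a short translation routine. The following is a minimal sketch for illustration only: it assumes the standard genetic code, and the example sequences and function names are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons that specify the given one-letter amino acid (or '*')."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
```

For instance, `translate("ATGCTGAGCAAA")` and `translate("ATGTTATCTAAG")` give the same four-residue peptide although three of the four codons differ, while `synonymous_codons("M")` and `synonymous_codons("W")` each return a single codon, consistent with the exception noted for methionine and tryptophan.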

[0129] In addition, the nucleic acid sequence may comprise a nucleotide sequence 
which results from the addition, deletion or substitution of at least one nucleotide to the 
5'-end and/or the 3'-end of the nucleic acid formula shown in SEQ ID NO:1, SEQ ID NO:2, 
SEQ ID NO:3, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:17, SEQ ID 
NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:27, SEQ ID NO:96, SEQ ID NO:98, 
SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, or SEQ ID NO:106, or a derivative 
thereof. Any nucleotide or polynucleotide may be used in this regard, provided that its 
addition, deletion or substitution does not alter the amino acid sequence of SEQ ID NO:5, 
SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID 
NO:18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:97, 
SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or SEQ ID NO:107, 
which is encoded by the nucleotide sequence. For example, the present invention is 
intended to include any nucleic acid sequence resulting from the addition of ATG as an 
initiation codon at the 5'-end of the inventive nucleic acid sequence or its derivative, or 
from the addition of TAA, TAG or TGA as a termination codon at the 3'-end of the 
inventive nucleotide sequence or its derivative. Moreover, the nucleic acid molecule of 
the present invention may, as necessary, have restriction endonuclease recognition sites 
added to its 5'-end and/or 3'-end. 

[0130] Such functional alterations of a given nucleic acid sequence afford an 
opportunity to promote secretion and/or processing of heterologous proteins encoded by 
foreign nucleic acid sequences fused thereto. All variations of the nucleotide sequence of 
the kinase genes of the invention and fragments thereof permitted by the genetic code are, 
therefore, included in this invention. 

[0131] Further, it is possible to delete codons or to substitute one or more codons 
with codons other than degenerate codons to produce a structurally modified polypeptide, 
but one which has substantially the same utility or activity as the polypeptide produced by 
the unmodified nucleic acid molecule. As recognized in the art, the two polypeptides are 
functionally equivalent, as are the two nucleic acid molecules that give rise to their 
production, even though the differences between the nucleic acid molecules are not related 
to the degeneracy of the genetic code. 

Mammalian STLK2 

[0132] The full-length human STLK2 cDNA (SEQ ID NO:1) is 3268 bp long and 
consists of a 1248 bp open reading frame (ORF) flanked by a 181 bp 5' untranslated 
region (UTR; 1-181) and a 1784 bp 3' UTR (1433-3216) that is followed by a 52 
nucleotide polyadenylated region. A polyadenylation signal (AATAAA) is found at 
positions (3193-3198). The sequence flanking the first ATG conforms to the Kozak 
consensus (Kozak, M., Nucleic Acids Res. 15, 8125-8148 (1987)) for an initiating 
methionine, and is believed to be the translational start site for STLK2. Furthermore, 
human STLK2 and the related SOK-1 and MST3 proteins conserve the amino acid 
sequence immediately following this presumed initiating methionine. 
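
Locating a polyadenylation signal or a candidate initiating ATG of the kind reported in these annotations amounts to a simple motif search. The following is a minimal sketch that reports positions as 1-based, inclusive coordinates, which is the convention the annotations in this section appear to use. The function name find_motif and the toy sequence are hypothetical.

```python
def find_motif(sequence, motif="AATAAA"):
    """Return 1-based inclusive (start, end) positions of every occurrence of motif."""
    sequence = sequence.upper()
    motif = motif.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        # Advance by one base so that overlapping occurrences are also reported.
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy 3' UTR fragment, for illustration only.
toy_utr = "GGCAATAAACCTT"
signal_positions = find_motif(toy_utr)

# The same search locates candidate start codons.
atg_positions = find_motif("CCATGG", motif="ATG")
```

A search of this kind only finds the hexamer; whether a given hit functions as a polyadenylation signal depends on its context in the transcript.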

[0133] Several EST fragments span the complete STLK2 sequence with 
AA191319 at the 5' end and W16504 at the 3' end. 

Mammalian STLK3 

[0134] The partial human STLK3 cDNA (SEQ ID NO:2) is 3030 bp long and 
consists of a 1548 bp ORF flanked by a 1476 bp 3' UTR (1550-3025) and a 5 nucleotide 
polyadenylated region. A potential polyadenylation signal (AATAAA) begins at position 
3004. Since the coding region is open throughout the 5' extent of this sequence, this is 
apparently a partial cDNA clone lacking the N-terminal start methionine. 

[0135] Multiple EST fragments span the complete STLK3 sequence with 
AA278967 at the 5' end and AA628477 and others at the 3' end. 

Mammalian STLK4 

[0136] The partial human STLK4 cDNA (SEQ ID NO:3) is 3857 bp long and 
consists of a 1242 bp ORF flanked by a 2596 bp 3' UTR (1244-3839) and an 18 nucleotide 
polyadenylated region. A potential polyadenylation signal (AATAAA) is found at 
positions 2181-3822. Since the coding region is open throughout the 5' extent of this 
sequence, this is apparently a partial cDNA clone lacking the N-terminal start methionine. 
A near full-length murine STLK4 cDNA is represented in the 1773 bp EST AA117438. It 
extends an additional 21 nucleotides 5' of the human STLK4 consensus, but since its 
coding region is open throughout the 5' extent of the sequence, this is also apparently a 
partial cDNA clone lacking the N-terminal start methionine. 

[0137] Several EST fragments span the complete STLK4 sequence with 
AA297759 at the 5' end and AA100484 and others at the 3' end. 

Mammalian STLK5 

[0138] The full-length human STLK5 cDNA (SEQ ID NO:96) is 2110 bp long and 
consists of a 1119 bp ORF flanked by a 229 bp 5' UTR and a 762 bp 3' UTR. The 
sequence flanking the first ATG conforms to the Kozak consensus (supra) for an initiating 
methionine, and is believed to be the translational start site for STLK5. Several EST 
fragments span the complete STLK5 sequence with AA297059 and F07734 at the 5' end, 
and R46686 and F03423 and others at the 3' end. 

Mammalian STLK6 

[0139] The full-length human STLK6 cDNA (SEQ ID NO:98) is 2,001 bp long 
and consists of a 1,254 bp ORF flanked by a 75 bp 5' UTR and a 673 bp 3' UTR. The 
sequence flanking the first ATG conforms to the Kozak consensus (supra) for an initiating 
methionine, and is believed to be the translational start site for STLK6. 

Mammalian STLK7 

[0140] The partial human STLK7 cDNA (SEQ ID NO:100) is 311 bp long and 
consists of a 309 bp ORF. Since the coding region is open throughout both the 5' and 3' 
extent of this sequence, this is apparently a partial cDNA clone lacking the N-terminal 
start methionine and C-terminal stop codon. 

Mammalian ZC1 

[0141] The full-length human ZC1 cDNA (SEQ ID NO:9) is 3798 bp long and 
consists of a 3717 bp ORF (7-3723) flanked by a 6 bp 5' UTR and a 75 bp (3724-3798) 3' 
UTR. No polyadenylation signal (AATAAA) or polyadenylated region is present in the 
3' UTR. The sequence flanking the first ATG conforms to the Kozak consensus for an 
initiating methionine, and is believed to be the translational start site for human ZC1. 

[0142] Multiple EST fragments (W81656) match the 3' end of the human ZC1 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

Mammalian ZC2 

[0143] The partial human ZC2 cDNA (SEQ ID NO:10) is 4055 bp long and 
consists of a 3891 bp ORF (1-3891) and a 164 bp (3892-4055) 3' UTR. Since the coding 
region is open throughout the 5' extent of this sequence, this is apparently a partial cDNA 
clone lacking the N-terminal start methionine. No polyadenylation signal (AATAAA) or 
polyadenylated region is present in the 3' UTR. 

[0144] Multiple EST fragments (R51245) match the 3' end of the human ZC2 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

Mammalian ZC3 

[0145] The partial human ZC3 cDNA (SEQ ID NO:11) is 4133 bp long and 
consists of a 3978 bp ORF (1-3978) and a 152 bp (3979-4133) 3' UTR. Since the 
coding region is open throughout the 5' extent of this sequence, this is apparently a partial 
cDNA clone lacking the N-terminal start methionine. No polyadenylation signal 
(AATAAA) or polyadenylated region is present in the 3' UTR. 

[0146] Multiple EST fragments (R54563) match the 3'end of the human ZC3 gene, 
but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

Mammalian ZC4 

[0147] The full-length human ZC4 cDNA (SEQ ID NO:104) is 3,684 bp long and 
was originally assembled from X chromosome genomic DNA sequence. 

[0148] Multiple EST fragments (R98571) match the 3'end of the human ZC4 gene, 
but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. The ZC4 gene is also contained within the human genomic 
clone Z83850. 

Mammalian KHS2 

[0149] The full-length human KHS2 cDNA (SEQ ID NO:17) is 4023 bp long and 
consists of a 2682 bp ORF (6-2687) flanked by a 5 bp (1-5) 5' UTR and a 1336 bp 
(2688-4023) 3' UTR. A potential polyadenylation signal (AATAAA) is found at positions 
4008-4013. No polyadenylated region is present in the 3' UTR. The sequence flanking the first 
ATG conforms to the Kozak consensus for an initiating methionine, and is believed to be 
the translational start site for human KHS2. 
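
The coordinates given for a cDNA of this kind can be cross-checked arithmetically: the 5' UTR, ORF and 3' UTR should tile the sequence without gaps, each stated length should equal end minus start plus one, and the ORF length should be a multiple of three. The following is a minimal sketch of such a check, using the KHS2 figures stated above. The function name check_layout is hypothetical.

```python
def check_layout(total_length, segments):
    """Check that 1-based inclusive segments tile positions 1..total_length.

    segments is a list of (name, start, end, stated_length) tuples in 5'-to-3' order.
    Returns a list of problems found; an empty list means the figures are consistent.
    """
    problems = []
    expected_start = 1
    for name, start, end, stated_length in segments:
        if start != expected_start:
            problems.append(f"{name}: starts at {start}, expected {expected_start}")
        if end - start + 1 != stated_length:
            problems.append(f"{name}: span is {end - start + 1} bp, stated {stated_length} bp")
        expected_start = end + 1
    if expected_start - 1 != total_length:
        problems.append(f"segments end at {expected_start - 1}, total is {total_length}")
    return problems


# Figures for human KHS2 (SEQ ID NO:17) as stated in the text above.
khs2 = [("5' UTR", 1, 5, 5), ("ORF", 6, 2687, 2682), ("3' UTR", 2688, 4023, 1336)]
khs2_problems = check_layout(4023, khs2)

# An ORF made of whole codons has a length divisible by three.
orf_is_whole_codons = 2682 % 3 == 0
```

For KHS2 the stated figures are mutually consistent. The same check applied to other entries in this section can flag figures that merit a second look against the Sequence Listing.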

[0150] Multiple EST fragments match the 5'end (AA446022) as well as the 3' end 
(R37625) of the human KHS2 gene. 

Mammalian SULU1 

[0151] The full-length human SULU1 cDNA (SEQ ID NO:19) is 4177 bp long and 
consists of a 2694 bp ORF (415-3108) flanked by a 414 bp (1-414) 5'UTR and a 1069 bp 
(3109-4177) 3' UTR followed by a 19 nucleotide polyadenylated region. A potential 
polyadenylation signal (AATAAA) is found at positions 4164-4169. The sequence 
flanking the first ATG conforms to the Kozak consensus for an initiating methionine, and 
is believed to be the translational start site for human SULU1. 

[0152] Multiple EST fragments match the 5'end (N27153) as well as the 3' end 
(R90908) of the human SULU1 gene. 

Mammalian (Murine) SULU3 

[0153] The partial murine SULU3 cDNA (SEQ ID NO:21) is 2249 bp long and 
consists of a 2244 bp ORF (6-2249) flanked by a 5 bp (1-5) 5'UTR. The sequence 
flanking the first ATG conforms to the Kozak consensus for an initiating methionine, and 
is believed to be the translational start site for murine SULU3. The 3' end of the murine 
SULU3 cDNA shares 90% DNA sequence identity over 1620 nucleotides with human 
SULU3, suggesting that these two genes are functional orthologues. 

[0154] One EST fragment (AA446022) matches the 3' end of the partial murine 
SULU3 gene, but at the time of filing, the inventors believe that none exist in GenBank or 
the EST database that match its 5' end. 

Mammalian (Human) SULU3 

[0155] The partial human SULU3 cDNA (SEQ ID NO:20) is 3824 bp long and 
consists of a 2358 bp ORF (2-2359) flanked by a 1465 bp (2360-3824) 3'UTR followed by 
a 19 nucleotide polyadenylated region. A potential polyadenylation signal (AATAAA) is 
found at positions 2602-2607. Since the coding region is open throughout the 5' extent of 
this sequence, this is apparently a partial cDNA clone lacking the N-terminal start 
methionine. The 5' end of the human SULU3 cDNA shares 90% DNA sequence identity 
over 1620 nucleotides with murine SULU3, suggesting that these two genes are functional 
orthologues. 

[0156] Multiple EST fragments (R02283) match the 3'end of the human SULU3 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

Mammalian GEK2 

[0157] The full-length human GEK2 cDNA (SEQ ID NO:106) is 2962 bp long and 
consists of a 2737 bp ORF (59-2795) flanked by a 58 bp (1-58) 5'UTR. The sequence 
flanking the first ATG conforms to the Kozak consensus for an initiating methionine, and 
is believed to be the translational start site for human GEK2. 

[0158] Multiple EST fragments (AA465671) match the 5'end, but at the time of 
filing, the inventors believe that only one (AA380492) matches the 3' end of the human 
GEK2 gene. 

Mammalian PAK4 

[0159] The full-length human PAK4 cDNA (SEQ ID NO:27) is 3604 bp long and 
consists of a 2043 bp ORF (143-2185) flanked by a 142 bp (1-142) 5' UTR and a 1419 bp 3' 
UTR followed by a 22 nucleotide polyadenylated region. A potential polyadenylation 
signal (AATTAAA) is found at positions 3582-3588. The sequence flanking the first 
ATG conforms to the Kozak consensus for an initiating methionine, and is believed to be 
the translational start site for human PAK4. 

[0160] Multiple EST fragments (AA535791) match the 3'end of the human PAK4 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

Mammalian PAK5 

[0161] The full-length human PAK5 cDNA (SEQ ID NO:102) is 2806 bp long and 
consists of a 1773 bp ORF flanked by a 201 bp 5' UTR and an 833 bp 3' UTR. The 
sequence flanking the first ATG conforms to the Kozak consensus (supra) for an initiating 
methionine, and is believed to be the translational start site for PAK5. 

[0162] Multiple EST fragments (AA442867) match the 3'end of the human PAK5 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

II. Nucleic Acid Probes, Methods, and Kits for Detection of STE20-Related Kinases. 

[0163] A nucleic acid probe of the present invention may be used to probe an 
appropriate chromosomal or cDNA library by usual hybridization methods to obtain other 
nucleic acid molecules of the present invention. A chromosomal DNA or cDNA library 
may be prepared from appropriate cells according to recognized methods in the art (cf. 
"Molecular Cloning: A Laboratory Manual", second edition, Cold Spring Harbor 
Laboratory, Sambrook, Fritsch, & Maniatis, eds., 1989). 

[0164] In the alternative, chemical synthesis can be carried out in order to obtain 
nucleic acid probes having nucleotide sequences which correspond to N-terminal and C- 
terminal portions of the amino acid sequence of the polypeptide of interest. The 
synthesized nucleic acid probes may be used as primers in a polymerase chain reaction 
(PCR) carried out in accordance with recognized PCR techniques, essentially according to 
PCR Protocols, "A Guide to Methods and Applications", Academic Press, Michael, et al., 
eds., 1990, utilizing the appropriate chromosomal or cDNA library to obtain the fragment 
of the present invention. 
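
The choice of a primer pair from the two ends of a target sequence can be sketched as follows. This is a minimal illustration that assumes the nucleotide sequence of the target is already known; it takes the forward primer from the 5' end of the sense strand and the reverse primer as the reverse complement of the 3' end. It does not address degenerate primers derived from an amino acid sequence, melting temperature, or specificity. The names reverse_complement and end_primers and the toy template are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def end_primers(template, length=18):
    """Return (forward, reverse) primers covering the two ends of the template."""
    if length > len(template):
        raise ValueError("primer length exceeds template length")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical toy template, for illustration only.
toy_template = "ATGGCTAAAGGTCCTGAATAA"
forward, reverse = end_primers(toy_template, length=6)
```

Both primers are written 5' to 3', so the reverse primer anneals to the sense strand at the 3' end of the target and is extended toward the forward primer.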

[0165] One skilled in the art can readily design such probes based on the sequence 
disclosed herein using methods of computer alignment and sequence analysis known in the 
art ("Molecular Cloning: A Laboratory Manual", 1989, supra). The hybridization probes 
of the present invention can be labeled by standard labeling techniques such as with a 
radiolabel, enzyme label, fluorescent label, biotin-avidin label, chemiluminescence, and 
the like. After hybridization, the probes may be visualized using known methods. 

[0166] The nucleic acid probes of the present invention include RNA, as well as 
DNA probes, such probes being generated using techniques known in the art. The nucleic 
acid probe may be immobilized on a solid support. Examples of such solid supports 
include, but are not limited to, plastics such as polycarbonate, complex carbohydrates such 
as agarose and sepharose, and acrylic resins, such as polyacrylamide and latex beads. 
Techniques for coupling nucleic acid probes to such solid supports are well known in the 
art. 

[0167] The test samples suitable for nucleic acid probing methods of the present 
invention include, for example, cells or nucleic acid extracts of cells, or biological fluids. 
The samples used in the above-described methods will vary based on the assay format, the 
detection method and the nature of the tissues, cells or extracts to be assayed. Methods for 
preparing nucleic acid extracts of cells are well known in the art and can be readily adapted 
in order to obtain a sample which is compatible with the method utilized. 

[0168] One method of detecting the presence of nucleic acids of the invention in a 
sample comprises (a) contacting said sample with the above-described nucleic acid probe 
under conditions such that hybridization occurs, and (b) detecting the presence of said 
probe bound to said nucleic acid molecule. One skilled in the art would select the nucleic 
acid probe according to techniques known in the art as described above. Samples to be 
tested include but should not be limited to RNA samples of human tissue. 

[0169] A kit for detecting the presence of nucleic acids of the invention in a 
sample comprises at least one container means having disposed therein the above- 
described nucleic acid probe. The kit may further comprise other containers comprising 
one or more of the following: wash reagents and reagents capable of detecting the presence 
of bound nucleic acid probe. Examples of detection reagents include, but are not limited 
to radiolabeled probes, enzymatic labeled probes (horseradish peroxidase, alkaline 
phosphatase), and affinity labeled probes (biotin, avidin, or streptavidin). 

[0170] In detail, a compartmentalized kit includes any kit in which reagents are 
contained in separate containers. Such containers include small glass containers, plastic 
containers or strips of plastic or paper. Such containers allow the efficient transfer of 
reagents from one compartment to another compartment such that the samples and 
reagents are not cross-contaminated and the agents or solutions of each container can be 
added in a quantitative fashion from one compartment to another. Such containers will 
include a container which will accept the test sample, a container which contains the probe 
or primers used in the assay, containers which contain wash reagents (such as phosphate 
buffered saline, Tris-buffers, and the like), and containers which contain the reagents used 
to detect the hybridized probe, bound antibody, amplified product, or the like. One skilled 
in the art will readily recognize that the nucleic acid probes described in the present 
invention can readily be incorporated into one of the established kit formats which are well 
known in the art. 

III. DNA Constructs Comprising a STE20-Related Nucleic Acid Molecule and 
Cells Containing These Constructs. 

[0171] The present invention also relates to a recombinant DNA molecule 
comprising, 5' to 3', a promoter effective to initiate transcription in a host cell and the 
above-described nucleic acid molecules. In addition, the present invention relates to a 
recombinant DNA molecule comprising a vector and an above-described nucleic acid 
molecule. The present invention also relates to a nucleic acid molecule comprising a 
transcriptional region functional in a cell, a sequence complementary to an RNA sequence 
encoding an amino acid sequence corresponding to the above-described polypeptide, and a 
transcriptional termination region functional in said cell. The above-described molecules 
may be isolated and/or purified DNA molecules. 

[0172] The present invention also relates to a cell or organism that contains an 
above-described nucleic acid molecule and thereby is capable of expressing a polypeptide. 
The polypeptide may be purified from cells which have been altered to express the 
polypeptide. A cell is said to be "altered to express a desired polypeptide" when the cell, 
through genetic manipulation, is made to produce a protein which it normally does not 
produce or which the cell normally produces at lower levels. One skilled in the art can 
readily adapt procedures for introducing and expressing either genomic, cDNA, or 
synthetic sequences into either eukaryotic or prokaryotic cells. 

[0173] A nucleic acid molecule, such as DNA, is said to be "capable of 
expressing" a polypeptide if it contains nucleotide sequences which contain transcriptional 
and translational regulatory information and such sequences are "operably linked" to 
nucleotide sequences which encode the polypeptide. An operable linkage is a linkage in 
which the regulatory DNA sequences and the DNA sequence sought to be expressed are 
connected in such a way as to permit gene sequence expression. The precise nature of the 
regulatory regions needed for gene sequence expression may vary from organism to 
organism, but shall in general include a promoter region which, in prokaryotes, contains 
both the promoter (which directs the initiation of RNA transcription) as well as the DNA 
sequences which, when transcribed into RNA, will signal synthesis initiation. Such regions 
will normally include those 5'-non-coding sequences involved with initiation of 
transcription and translation, such as the TATA box, capping sequence, CAAT sequence, 
and the like. 

[0174] If desired, the non-coding region 3' to the sequence encoding a kinase of the 
invention may be obtained by the above-described methods. This region may be retained 
for its transcriptional termination regulatory sequences, such as termination and 
polyadenylation. Thus, by retaining the 3'-region naturally contiguous to the DNA 
sequence encoding a kinase of the invention, the transcriptional termination signals may be 
provided. Where the transcriptional termination signals are not satisfactorily functional in 
the expression host cell, then a 3' region functional in the host cell may be substituted. 

[0175] Two DNA sequences (such as a promoter region sequence and a sequence 
encoding a kinase of the invention) are said to be operably linked if the nature of the 
linkage between the two DNA sequences does not (1) result in the introduction of a frame- 
shift mutation, (2) interfere with the ability of the promoter region sequence to direct the 
transcription of a gene sequence encoding a kinase of the invention, or (3) interfere with 
the ability of the gene sequence of a kinase of the invention to be transcribed by the 
promoter region sequence. Thus, a promoter region would be operably linked to a DNA 
sequence if the promoter were capable of effecting transcription of that DNA sequence. 
Thus, to express a gene encoding a kinase of the invention, transcriptional and 
translational signals recognized by an appropriate host are necessary. 

[0176] The present invention encompasses the expression of a gene encoding a 
kinase of the invention (or a functional derivative thereof) in either prokaryotic or 
eukaryotic cells. Prokaryotic hosts are, generally, very efficient and convenient for the 
production of recombinant proteins and are, therefore, one type of preferred expression 
system for kinases of the invention. Prokaryotes most frequently are represented by 
various strains of E. coli. However, other microbial strains may also be used, including 
other bacterial strains. 

[0177] In prokaryotic systems, plasmid vectors that contain replication sites and 
control sequences derived from a species compatible with the host may be used. Examples 
of suitable plasmid vectors may include pBR322, pUC118, pUC119 and the like; suitable 
phage or bacteriophage vectors may include λgt10, λgt11 and the like; and suitable virus 
vectors may include pMAM-neo, pKRC and the like. Preferably, the selected vector of the 
present invention has the capacity to replicate in the selected host cell. 

[0178] Recognized prokaryotic hosts include bacteria such as E. coli, Bacillus, 
Streptomyces, Pseudomonas, Salmonella, Serratia, and the like. However, under such 
conditions, the polypeptide will not be glycosylated. The prokaryotic host must be 
compatible with the replicon and control sequences in the expression plasmid. 

[0179] To express a kinase of the invention (or a functional derivative thereof) in a 
prokaryotic cell, it is necessary to operably link the sequence encoding the kinase of the 
invention to a functional prokaryotic promoter. Such promoters may be either constitutive 
or, more preferably, regulatable (i.e., inducible or derepressible). Examples of constitutive 
promoters include the int promoter of bacteriophage λ, the bla promoter of the β-lactamase 
gene sequence of pBR322, and the cat promoter of the chloramphenicol acetyl transferase 
gene sequence of pPR325, and the like. Examples of inducible prokaryotic promoters 
include the major right and left promoters of bacteriophage λ (PL and PR), the trp, recA, 
lacZ, lacI, and gal promoters of E. coli, the α-amylase (Ulmanen et al., J. Bacteriol. 
162:176-182, 1985) and the σ-28-specific promoters of B. subtilis (Gilman et al., Gene 
Sequence 32:11-20, 1984), the promoters of the bacteriophages of Bacillus (Gryczan, In: 
The Molecular Biology of the Bacilli, Academic Press, Inc., NY, 1982), and Streptomyces 
promoters (Ward et al., Mol. Gen. Genet. 203:468-478, 1986). Prokaryotic promoters are 
reviewed by Glick (Ind. Microbiol. 1:277-282, 1987), Cenatiempo (Biochimie 68:505-516, 
1986), and Gottesman (Ann. Rev. Genet. 18:415-442, 1984). 

[0180] Proper expression in a prokaryotic cell also requires the presence of a 
ribosome-binding site upstream of the gene sequence-encoding sequence. Such ribosome- 
binding sites are disclosed, for example, by Gold et al. (Ann. Rev. Microbiol. 35:365-404, 
1981). The selection of control sequences, expression vectors, transformation methods, 
and the like, are dependent on the type of host cell used to express the gene. As used 
herein, "cell", "cell line", and "cell culture" may be used interchangeably and all such 
designations include progeny. Thus, the words "transformants" or "transformed cells" 
include the primary subject cell and cultures derived therefrom, without regard to the 
number of transfers. It is also understood that all progeny may not be precisely identical in 
DNA content, due to deliberate or inadvertent mutations. However, as defined, mutant 
progeny have the same functionality as that of the originally transformed cell. 

[0181] Host cells which may be used in the expression systems of the present 
invention are not strictly limited, provided that they are suitable for use in the expression 
of the kinase polypeptide of interest. Suitable hosts may often include eukaryotic cells. 
Preferred eukaryotic hosts include, for example, yeast, fungi, insect cells, mammalian cells 
either in vivo, or in tissue culture. Mammalian cells which may be useful as hosts include 
HeLa cells, cells of fibroblast origin such as VERO or CHO-K1, or cells of lymphoid 
origin and their derivatives. Preferred mammalian host cells include SP2/0 and J558L, as 
well as neuroblastoma cell lines such as IMR 332, which may provide better capacities for 
correct post-translational processing. 

[0182] In addition, plant cells are also available as hosts, and control sequences 
compatible with plant cells are available, such as the cauliflower mosaic virus 35S and 
19S, and nopaline synthase promoter and polyadenylation signal sequences. Another 
preferred host is an insect cell, for example the Drosophila larvae. Using insect cells as 
hosts, the Drosophila alcohol dehydrogenase promoter can be used (Rubin, Science 
240:1453-1459, 1988). Alternatively, baculovirus vectors can be engineered to express 
large amounts of kinases of the invention in insect cells (Jasny, Science 238:1653, 1987; 
Miller et al., In: Genetic Engineering, Vol. 8, Plenum, Setlow et al., eds., pp. 277-297, 
1986). 

[0183] Any of a series of yeast expression systems can be utilized which 
incorporate promoter and termination elements from the actively expressed sequences 
coding for glycolytic enzymes that are produced in large quantities when yeast are grown 
in mediums rich in glucose. Known glycolytic gene sequences can also provide very 
efficient transcriptional control signals. Yeast provides substantial advantages in that it 
can also carry out post-translational modifications. A number of recombinant DNA 
strategies exist utilizing strong promoter sequences and high copy number plasmids which 
can be utilized for production of the desired proteins in yeast. Yeast recognizes leader 
sequences on cloned mammalian genes and secretes peptides bearing leader sequences 
(i.e., pre-peptides). Several possible vector systems are available for the expression of 
kinases of the invention in a mammalian host. 

[0184] A wide variety of transcriptional and translational regulatory sequences may 
be employed, depending upon the nature of the host. The transcriptional and translational 
regulatory signals may be derived from viral sources, such as adenovirus, bovine 
papilloma virus, cytomegalovirus, simian virus, or the like, where the regulatory signals 
are associated with a particular gene sequence which has a high level of expression. 
Alternatively, promoters from mammalian expression products, such as actin, collagen, 
myosin, and the like, may be employed. Transcriptional initiation regulatory signals may 
be selected which allow for repression or activation, so that expression of the gene 
sequences can be modulated. Of interest are regulatory signals which are temperature- 
sensitive so that by varying the temperature, expression can be repressed or initiated, or are 
subject to chemical (such as metabolite) regulation. 

[0185] Expression of kinases of the invention in eukaryotic hosts requires the use 
of eukaryotic regulatory regions. Such regions will, in general, include a promoter region 
sufficient to direct the initiation of RNA synthesis. Preferred eukaryotic promoters include, 
for example, the promoter of the mouse metallothionein I gene sequence (Hamer et al., J. 
Mol. Appl. Gen. 1:273-288, 1982); the TK promoter of Herpes virus (McKnight, Cell 
31:355-365, 1982); the SV40 early promoter (Benoist et al., Nature (London) 290:304-310, 
1981); and the yeast gal4 gene sequence promoter (Johnston et al., Proc. Natl. Acad. Sci. 
(USA) 79:6971-6975, 1982; Silver et al., Proc. Natl. Acad. Sci. (USA) 81:5951-5955, 
1984). 

[0186] Translation of eukaryotic mRNA is initiated at the codon which encodes the 
first methionine. For this reason, it is preferable to ensure that the linkage between a 
eukaryotic promoter and a DNA sequence which encodes a kinase of the invention (or a 
functional derivative thereof) does not contain any intervening codons which are capable 
of encoding a methionine (i.e., AUG). The presence of such codons results either in the 
formation of a fusion protein (if the AUG codon is in the same reading frame as the kinase 
of the invention coding sequence) or a frame-shift mutation (if the AUG codon is not in 
the same reading frame as the kinase of the invention coding sequence). 
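
The screening described above, which looks for intervening ATG codons between the promoter and the coding sequence, can be sketched as a scan of the leader DNA. This minimal illustration assumes that the leader ends immediately before the intended start codon, so that an upstream ATG is in frame when its distance to the start codon is a multiple of three. It ignores any stop codon lying between an upstream ATG and the intended start. The function name classify_upstream_atgs and the toy leaders are hypothetical.

```python
def classify_upstream_atgs(leader):
    """Classify each ATG in the leader relative to the downstream coding frame.

    leader is the DNA between the transcription start and the intended start
    codon (the start codon itself is not included). Returns a list of
    (1-based position, "in-frame" or "out-of-frame") tuples.
    """
    leader = leader.upper()
    results = []
    index = leader.find("ATG")
    while index != -1:
        # Number of bases from this ATG to the first base of the coding sequence.
        distance_to_start = len(leader) - index
        frame = "in-frame" if distance_to_start % 3 == 0 else "out-of-frame"
        results.append((index + 1, frame))
        index = leader.find("ATG", index + 1)
    return results


# Hypothetical toy leaders, for illustration only.
fusion_risk = classify_upstream_atgs("CCATGGCC")   # ATG six bases before the start
frameshift_risk = classify_upstream_atgs("CCATGGC")  # ATG five bases before the start
clean_leader = classify_upstream_atgs("CCGGCC")    # no intervening ATG
```

A leader for which the function returns an empty list contains no intervening ATG and so satisfies the preference stated above.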

[0187] A nucleic acid molecule encoding a kinase of the invention and an operably 
linked promoter may be introduced into a recipient prokaryotic or eukaryotic cell either as 
a nonreplicating DNA or RNA molecule, which may either be a linear molecule or, more 
preferably, a closed covalent circular molecule. Since such molecules are incapable of 
autonomous replication, the expression of the gene may occur through the transient 
expression of the introduced sequence. Alternatively, permanent expression may occur 
through the integration of the introduced DNA sequence into the host chromosome. 

[0188] A vector may be employed which is capable of integrating the desired gene 
sequences into the host cell chromosome. Cells which have stably integrated the 
introduced DNA into their chromosomes can be selected by also introducing one or more 
markers which allow for selection of host cells which contain the expression vector. The 
marker may provide for prototrophy to an auxotrophic host, biocide resistance, e.g., 
antibiotics, or heavy metals, such as copper, or the like. The selectable marker gene 
sequence can either be directly linked to the DNA gene sequences to be expressed, or 
introduced into the same cell by co-transfection. Additional elements may also be needed 
for optimal synthesis of mRNA. These elements may include splice signals, as well as 
transcription promoters, enhancers, and termination signals. cDNA expression vectors 
incorporating such elements include those described by Okayama (Mol. Cell. Biol. 3:280-, 
1983). 

[0189] The introduced nucleic acid molecule can be incorporated into a plasmid or 
viral vector capable of autonomous replication in the recipient host. Any of a wide variety 
of vectors may be employed for this purpose. Factors of importance in selecting a 
particular plasmid or viral vector include: the ease with which recipient cells that contain 
the vector may be recognized and selected from those recipient cells which do not contain 
the vector; the number of copies of the vector which are desired in a particular host; and 
whether it is desirable to be able to "shuttle" the vector between host cells of different 
species. 

[0190] Preferred prokaryotic vectors include plasmids such as those capable of 
replication in E. coli (such as, for example, pBR322, ColE1, pSC101, pACYC 184, πVX; 
"Molecular Cloning: A Laboratory Manual", 1989, supra). Bacillus plasmids include 
pC194, pC221, pT127, and the like (Gryczan, In: The Molecular Biology of the Bacilli, 
Academic Press, NY, pp. 307-329, 1982). Suitable Streptomyces plasmids include pIJ101 
(Kendall et al., J. Bacteriol. 169:4177-4183, 1987), and Streptomyces bacteriophages such 
as φC31 (Chater et al., In: Sixth International Symposium on Actinomycetales Biology, 
Akademiai Kaido, Budapest, Hungary, pp. 45-54, 1986). Pseudomonas plasmids are 
reviewed by John et al. (Rev. Infect. Dis. 8:693-704, 1986), and Izaki (Jpn. J. Bacteriol. 
33:729-742, 1978). 

[0191] Preferred eukaryotic plasmids include, for example, BPV, vaccinia, SV40, 
2-micron circle, and the like, or their derivatives. Such plasmids are well known in the art 
(Botstein et al., Miami Wntr. Symp. 19:265-274, 1982; Broach, In: "The Molecular 
Biology of the Yeast Saccharomyces: Life Cycle and Inheritance", Cold Spring Harbor 
Laboratory, Cold Spring Harbor, NY, pp. 445-470, 1981; Broach, Cell 28:203-204, 1982; 
Bollon et al., J. Clin. Hematol. Oncol. 10:39-48, 1980; Maniatis, In: Cell Biology: A 
Comprehensive Treatise, Vol. 3, Gene Sequence Expression, Academic Press, NY, pp. 
563-608, 1980). 

[0192] Once the vector or nucleic acid molecule containing the construct(s) has 
been prepared for expression, the DNA construct(s) may be introduced into an appropriate 
host cell by any of a variety of suitable means, i.e., transformation, transfection, 
conjugation, protoplast fusion, electroporation, particle gun technology, calcium 
phosphate-precipitation, direct microinjection, and the like. After the introduction of the 
vector, recipient cells are grown in a selective medium, which selects for the growth of 
vector-containing cells. Expression of the cloned gene(s) results in the production of a 
kinase of the invention, or fragments thereof. This can take place in the transformed cells 
as such, or following the induction of these cells to differentiate (for example, by 
administration of bromodeoxyuracil to neuroblastoma cells or the like). A variety of 
incubation conditions can be used to form the peptide of the present invention. The most 
preferred conditions are those which mimic physiological conditions. 

IV. The Proteins of the Invention 

[0193] A variety of methodologies known in the art can be utilized to obtain the 
polypeptides of the present invention. The polypeptides may be purified from tissues or 
cells that naturally produce the polypeptides. Alternatively, the above-described isolated 
nucleic acid fragments could be used to express the kinases of the invention in any 
organism. The samples of the present invention include cells, protein extracts or 
membrane extracts of cells, or biological fluids. The samples will vary based on the assay 
format, the detection method, and the nature of the tissues, cells or extracts used as the 
sample. 

[0194] Any eukaryotic organism can be used as a source for the polypeptides of the 
invention, as long as the source organism naturally contains such polypeptides. As used 
herein, "source organism" refers to the original organism from which the amino acid 
sequence of the subunit is derived, regardless of the organism the subunit is expressed in 
and ultimately isolated from. 

[0195] One skilled in the art can readily follow known methods for isolating 
proteins in order to obtain the polypeptides free of natural contaminants. These include, 
but are not limited to: size-exclusion chromatography, HPLC, ion-exchange 
chromatography, and immuno-affinity chromatography. 

Mammalian STLK2 

[0196] Analysis of the deduced amino acid sequence predicts STLK2 to be an 
intracellular serine/threonine kinase, lacking both a signal sequence and transmembrane 
domain. STLK2 contains a 21 amino acid N-terminal domain, a 253 amino acid catalytic 
domain with all the motifs characteristic of a serine/threonine kinase, followed by a 142 
amino acid C-terminal domain. 

[0197] STLK2 is most closely related to human STE20-subfamily kinases, MST3 
(GB:AF024636) and SOK-1 (GB:X99325), and a C. elegans kinase, yk34b11.5 
(GB:U53153), sharing 72.7%, 68.7%, and 69.3% amino acid identity, respectively. 

[0198] The 21 amino acid N-terminal domain of human STLK2 is 71.4% identical 
to the N-terminus of MST3 (GB:AF024636). Human STLK2 lacks a glycine residue at 
position 2, and is therefore unlikely to undergo myristylation. A Smith-Waterman search 
of the nonredundant protein database does not reveal any significant homologies that 
might suggest a potential function for this domain. 
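By way of illustration only, the local-alignment scoring that underlies a Smith-Waterman search can be sketched as follows. The scoring values (match, mismatch, and a linear gap penalty) and the function name are assumptions chosen for brevity; the specification does not state the substitution matrix or gap parameters actually used, and a real database search would typically use a matrix such as BLOSUM with affine gaps.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Simplified sketch: identity-based scoring and a linear gap penalty
    (both are assumptions, not parameters taken from the specification).
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical toy sequences sharing only the local stretch "GKV".
print(smith_waterman_score("AAAGKVAAA", "TTTGKVTTT"))
```

A score near zero for a query against every database entry corresponds to the "no significant homologies" outcome described above.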

[0199] The 253 amino acid catalytic domain of human STLK2 is most related to 
human SOK-1 (GB:X99325), MST3 (GB:AF024636), C. elegans yk32b11.5 (GB:U53153), 
and STLK3 (SEQ ID NO:6), sharing 88.9%, 87.4%, 78.3%, and 49% amino acid identity, 
respectively, placing it in the STLK-subfamily of STE20-related kinases. The STLK2 
kinase domain displayed lesser homology to other STE20-related kinases including: 
55.9% to human MST2 (GB:U26424), 49.2% to human GCK (GB:U07349), 49.2% to 
human KHS1 (GB:U77129), and 44.2% to human HPK1 (GB:U66464). The activation 
loop of human STLK2 catalytic domain is identical to that of human SOK-1 and MST3 
including the presence of four potential threonine phosphorylation sites that could serve an 
autoregulatory role on kinase activity. 
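For illustration, a percent amino acid identity of the kind reported above can be computed from a pairwise alignment as sketched below. The choice of denominator (aligned columns in which neither sequence has a gap) is an assumption; the specification does not state which convention was used, and the sequences shown are hypothetical fragments, not entries from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments; "-" marks a gap introduced by the alignment.
print(round(percent_identity("MAHSPVQ-SGL", "MAHSPVAVSGL"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, which is one reason identity figures from different sources are not always directly comparable.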

[0200] The 142 amino acid C-terminal domain of human STLK2 is most related to 
human SOK-1 (GB:X99325), MST3 (GB:AF024636), and C. elegans yk32b11.5 
(GB:U53153), sharing 39.9%, 39.9%, and 33.3% amino acid identity, respectively. This 
C-terminal domain shares some significant amino acid similarity to the C-terminal 
domains of the related human STLK3 (SEQ ID NO:6) and STLK4 (SEQ ID NO:7). 

[0201] The C-terminus of the related human SOK-1 (GB:X99325) kinase has been 
shown to be inhibitory to the catalytic activity of this kinase (Pombo, C.M., Bonventre, 
J.V., Molnar, A., Kyriakis, J. and Force, T. EMBO J. 15, 4537-4546 (1996)). Based on the 
sequence identity between the C-termini of human SOK-1 (GB:X99325) and human 
STLK2 (39.2%), the C-terminus of human STLK2 may also function as an inhibitory 
domain for its kinase. 

Mammalian STLK3 

[0202] The 3030 bp human STLK3 nucleotide sequence of the partial cDNA clone 
encodes a polypeptide of 516 amino acids (SEQ ID NO:6) with a predicted molecular 
mass of 56,784 daltons. Analysis of the deduced amino acid sequence predicts STLK3 to 
be an intracellular serine/threonine kinase, lacking both a signal sequence and 
transmembrane domain; however, the cDNA clone lacks an initiating ATG, so the full 
extent of its N-terminus is not known. STLK3 contains a 31 amino acid N-terminal domain, 
a 277 amino acid catalytic domain with all the motifs characteristic of a serine/threonine 
kinase, followed by a 181 amino acid C-terminal domain containing a 25 amino acid insert 
and a 27 amino acid tail relative to the sequence of human STLK2. 
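As an illustrative sketch, a predicted molecular mass such as the one stated above is conventionally obtained by summing average residue masses and adding one water molecule for the free termini. The specification does not say which tool or mass table produced its figures, so the table and the function name below are assumptions.

```python
# Average amino acid residue masses in daltons (standard textbook values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per polypeptide chain (free N- and C-termini)


def predicted_mass(seq: str) -> float:
    """Sum average residue masses for an unmodified polypeptide."""
    seq = seq.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[r] for r in seq) + WATER


# Hypothetical short peptide, not taken from the Sequence Listing.
print(round(predicted_mass("MAHSPVQSGL"), 1))
```

Post-translational modifications such as phosphorylation are not accounted for in this sum.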

[0203] STLK3 is most closely related to human STE20-subfamily kinases, STLK4 
(SEQ ID NO:7), MST3 (GB:AF024636), SOK-1 (GB:X99325) and STLK2 (SEQ ID 
NO:5), sharing 71.1%, 37.6%, 38.1%, and 38.4% amino acid identity, respectively. 

[0204] The 31 amino acid N-terminal domain of human STLK3 lacked any 
significant amino acid sequence homologies using a Smith-Waterman search of the 
nonredundant protein database, other than sequence similarity to proline-alanine repeats. 

[0205] The 277 amino acid catalytic domain of human STLK3 is most related to 
human STLK4 (SEQ ID NO:7), SOK-1 (GB:X99325), MST3 (GB:AF024636), and 
STLK2 (SEQ ID NO:5), sharing 88.2%, 49.2%, 49%, and 49% amino acid identity, 
respectively. It also shares strong homology to other STKs from lower organisms, 
including 51.7% to A. thaliana (GB:AC002343), 43.1% to A. thaliana (GB:Z97336), 
42.1% to A. thaliana (GB:U96613), and 43.3% to C. elegans (GB:U53153). The 
activation loop of the human STLK3 catalytic domain conserves three potential threonine 
phosphorylation sites with other members of the STLK-subfamily of STE20-related 
kinases (human STE20, MST3, STLK2, STLK4) that could serve an autoregulatory role 
on kinase activity. 

[0206] The 181 amino acid C-terminal domain of human STLK3 shares 55.5% 
amino acid identity to human STLK4 (SEQ ID NO:7), and is 100% identical to a partial 
human cDNA DCHT (GB:AF017635). The C-terminal domain of human STLK3 contains 
a 26 amino acid insert relative to human STE20. A similar (87.5% amino acid identity) 26 
amino acid insert is also present in human STLK4. 

[0207] The 27 amino acid C-terminal tail of human STLK3 shares 77.8% amino 
acid identity to human STLK4, but is absent from other STLK-family members. This high 
degree of homology between the C-tail of two STLK-family members suggests they may 
be involved in an as yet unidentified protein-protein interaction. 

[0208] The weak sequence homology between the C-termini of human STLK3 and 
STE20 suggests it may also function as an inhibitory domain for its kinase. 

Mammalian STLK4 

[0209] The 3857 bp human STLK4 nucleotide sequence of the partial cDNA clone 
encodes a polypeptide of 414 amino acids (SEQ ID NO: 7) with a predicted molecular 
mass of 45,451 daltons. Analysis of the deduced amino acid sequence predicts STLK4 to 
be an intracellular serine/threonine kinase, lacking both a signal sequence and 
transmembrane domain; however, the cDNA clone lacks an initiating ATG, so the full 
extent of its N-terminus is not known. The partial STLK4 protein sequence contains a 178 
amino acid catalytic domain corresponding to the C-terminal motifs VI-XI of a 
serine/threonine kinase, followed by a 236 amino acid C-terminal domain containing two 
inserts of 25 and 41 amino acids each, relative to the sequence of human STLK2. 

[0210] STLK4 is most closely related to human STE20-subfamily kinases, STLK3 
(SEQ ID NO:6), MST3 (GB:AF024636), STLK2 (SEQ ID NO:5), and SOK-1 
(GB:X99325) sharing 71.0%, 46.8%, 43.9%, and 37.7% amino acid identity, respectively. 

[0211] The 178 amino acid catalytic domain of human STLK4 is most related to 
human STLK3 (SEQ ID NO:6), SOK-1 (GB:X99325), MST3 (GB:AF024636), STLK2 
(SEQ ID NO:5), and MST1 (GB:U18297), sharing 88.2%, 54.2%, 54.0%, 53.7%, and 45.7% 
amino acid identity, respectively. It also shares strong homology to other STKs from 
lower organisms, including 56.9% to A. thaliana (GB:AC002343), 52.5% to C. elegans 
(GB:U53153), 46.2% to A. thaliana (GB:Z97336), and 45.7% to A. thaliana 
(GB:U96613). The activation loop of the human STLK4 catalytic domain conserves three 
potential threonine phosphorylation sites with other members of the STLK-subfamily of 
STE20-related kinases (human STE20, MST3, STLK2 and STLK3) that could serve an 
autoregulatory role on kinase activity. 

[0212] The 236 amino acid C-terminal domain of human STLK4 shares 58.1% 
amino acid identity to both human STLK3 (SEQ ID NO: 6) and to a partial human cDNA, 
DCHT (GB:AF017635). The C-terminal domain of human STLK4 contains a 25 amino 
acid insert relative to human SOK-1 and shares 87.5% amino acid identity to an insert 
present in human STLK3. 

[0213] The weak sequence homology between the C-termini of human STLK4 and 
STE20 suggests it may also function as an inhibitory domain for its kinase. 

Mammalian STLK5 

[0214] The full-length 2110 bp human STLK5 cDNA encodes a polypeptide of 
373 amino acids (SEQ ID NO:97) with a predicted molecular mass of 41,700 daltons. 
Analysis of the deduced amino acid sequence predicts STLK5 to be an intracellular 
STE20-subfamily kinase, lacking both a signal sequence and transmembrane domain. 
STLK5 contains a 10 amino acid N-terminal domain, a 311 amino acid catalytic domain 
with all the motifs characteristic of a serine/threonine kinase, and a 52 amino acid 
C-terminal domain. 

[0215] STLK5 is most closely related to the human STE20-subfamily kinases 
STLK6 (SEQ ID NO:99) and SPAK (AF099989), sharing 51% and 33% amino acid 
identity, respectively, over its full extent. It also shares significant homology to database 
entries from Arabidopsis thaliana (GB:AC002343) and C. elegans (GB:AL023843, 
GB:AL023843). 

[0216] The 10 amino acid N-terminal domain of human STLK5 does not reveal 
any significant homologies to the protein database. 

[0217] The 311 amino acid catalytic domain of human STLK5 shares 51% and 34% 
identity to STLK6 and SPAK, respectively. The catalytic domain of STLK5 contains a 
45 amino acid insert between kinase subdomains X and XI relative to human STE20. 
Multiple human EST fragments as well as a murine EST (GB:AA575647) contain this 
insert, providing evidence that this region is an integral part of STLK5. 

[0218] The 52 amino acid C-terminal tail of human STLK5 shares 41.3% amino 
acid identity to human SOK-1 (GB:X99325). The weak sequence homology between the 
C-termini of human STLK5 and STE20 suggests it may also function as an inhibitory 
domain for its kinase. 

Mammalian STLK6 

[0219] The 2,001 bp human STLK6 nucleotide sequence of the complete cDNA 
encodes a polypeptide of 418 amino acids (SEQ ID NO:99) with a predicted molecular 
mass of 47,025 daltons. Analysis of the deduced amino acid sequence predicts STLK6 to 
be an intracellular STE20-subfamily kinase, lacking both a signal sequence and 
transmembrane domain. STLK6 contains a 57 amino acid N-terminal domain, a 312 
amino acid catalytic domain with all the motifs characteristic of a serine/threonine kinase, 
followed by a 49 amino acid C-terminal domain. 

[0220] STLK6 is most closely related to human STE20-subfamily kinases STLK5 
(SEQ ID NO:97), STLK7 (SEQ ID NO:101), and SPAK (AF099989), sharing 50%, 35%, 
and 30% amino acid identity, respectively, over its full extent. It also shares significant 
homology to database entries from Arabidopsis thaliana (GB:AC002343) and C. elegans 
(GB:U53153). 

[0221] The 57 amino acid N-terminal domain of human STLK6 does not reveal 
any significant homologies in the protein database. 

[0222] The 312 amino acid catalytic domain of human STLK6 shares 51% and 30% 
identity to human STLK5 and SPAK, respectively. 

[0223] The 49 amino acid C-terminal tail of human STLK6 shares low amino acid 
sequence identity (29%) with STLK5 and SPAK. 

Mammalian STLK7 

[0224] The 311 bp human STLK7 nucleotide sequence of the partial cDNA 
encodes a polypeptide of 103 amino acids (SEQ ID NO: 101). Analysis of the deduced 
amino acid sequence predicts STLK7 to be an internal fragment of an intracellular STE20- 
family kinase. This sequence lacks the N- and C-terminal portions of STLK7 and contains 
only the N-terminal 103 amino acids of the predicted catalytic domain. 

[0225] Human STLK7 is most closely related to human STE20-subfamily kinases 
SPAK (AF099989), STLK5 (SEQ ID NO:97), and STLK6 (SEQ ID NO:99), sharing 
86%, 38%, and 35% amino acid identity within this region of the kinase domain. It also 
shares significant homology to database entries from Arabidopsis thaliana 
(GB:AC002343) and Drosophila melanogaster (GB:AF006640). 

Mammalian ZC1 

[0226] The 3798 bp human ZC1 nucleotide sequence encodes a polypeptide of 
1239 amino acids (SEQ ID NO:13) with a predicted molecular mass of 142,140 daltons. 
Analysis of the deduced amino acid sequence predicts ZC1 to be an intracellular 
serine/threonine kinase, lacking both a signal sequence and transmembrane domain. The 
full-length ZC1 protein contains a 22 amino acid N-terminus, a 267 amino acid catalytic 
domain with all the motifs characteristic of a serine/threonine kinase, a 237 amino acid 
region predicted to form a coiled-coil structure, a 114 amino acid proline-rich region, a 
256 amino acid spacer region, followed by a 343 amino acid C-terminal domain containing 
a potential Rab/Rho-binding region. 
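
The domain lengths recited for ZC1 can be checked against the stated polypeptide length with a short sketch such as the following. It assumes, purely for illustration, that the domains are contiguous, non-overlapping, and listed in N- to C-terminal order starting at residue 1; the resulting coordinates are therefore derived values, not boundaries stated in the specification.

```python
from itertools import accumulate

# Domain lengths (amino acids) as recited for ZC1, in N- to C-terminal order.
ZC1_DOMAINS = [
    ("N-terminus", 22),
    ("catalytic domain", 267),
    ("coiled-coil region", 237),
    ("proline-rich region", 114),
    ("spacer region", 256),
    ("C-terminal domain", 343),
]


def domain_coordinates(domains):
    """Return (name, start, end) with 1-based inclusive residue numbering.

    Assumes contiguous, non-overlapping domains beginning at residue 1.
    """
    ends = list(accumulate(length for _, length in domains))
    return [(name, end - length + 1, end)
            for (name, length), end in zip(domains, ends)]


for name, start, end in domain_coordinates(ZC1_DOMAINS):
    print(f"{name}: {start}-{end}")
```

Under this assumption the final domain ends at residue 1239, which agrees with the 1239 amino acid length given for the ZC1 polypeptide.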

[0227] ZC1 is most closely related to the human STE20-subfamily kinases ZC2 
(SEQ ID NO:14), ZC3 (SEQ ID NO:15), and ZC4 (SEQ ID NO:16), sharing 61.7%, 
60.9%, and 43.8% amino acid identity, respectively. ZC1 also shares 45.5% amino acid 
identity to a C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029). ZC1 
exhibits 90.0% amino acid homology to murine NIK (GB:U88984), suggesting it may be 
the human orthologue of this STK. 

[0228] The 22 amino acid N-terminal domain of human ZC1 is 58.8% identical to 
the C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029), and 100% identical 
to murine NIK (GB: U88984). Human ZC1 lacks a glycine residue at position 2, and is 
therefore unlikely to undergo myristylation. A Smith-Waterman search of the 
nonredundant protein database does not reveal any significant homologies that might 
suggest a potential function for this domain. 
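
The position-2 glycine criterion mentioned above can be expressed as a simple check, sketched here for illustration. It assumes the sequence is given with its initiator methionine as residue 1 and tests only this single necessary condition; it is not a full myristylation-site predictor, and the function name and example sequences are hypothetical.

```python
def has_glycine_at_position_2(seq: str) -> bool:
    """True if the sequence starts with Met-Gly (Gly at position 2).

    A glycine at position 2 is required for N-terminal myristylation,
    so its absence argues against this modification.
    """
    s = seq.upper()
    return len(s) >= 2 and s[0] == "M" and s[1] == "G"


# Hypothetical N-terminal fragments, not taken from the Sequence Listing.
print(has_glycine_at_position_2("MGSSKSKPK"))
print(has_glycine_at_position_2("MANDSPAKS"))
```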

[0229] The 267 amino acid catalytic domain of human ZC1 is most related to 
human STE20-subfamily kinases, ZC3 (SEQ ID NO: 15), ZC2 (SEQ ID NO: 14), KHS2 
(SEQ ID NO: 18), SOK-1 (GB:X99325), GCK (GB:U07349), and GEK2 (SEQ ID 
NO: 107), and to the C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029) 
sharing 90.6%, 90.2%, 50.6%, 47.4%, 45.4%, 42.5% and 82.6% amino acid identity, 
respectively. The ZC1 kinase domain shares 98.1% identity to murine NIK (GB:U88984). 
ZC1 contains the potential "TPY" regulatory phosphorylation site in its activation loop. 
This "TPY" motif is conserved in other STE20-related kinases, including ZC2, ZC3, ZC4, 
GEK2, KHS2, SULU1, SULU3, PAK4 and PAK5. 

[0230] Immediately C-terminal to the kinase domain of human ZC1 is a 237 amino 
acid region predicted to form a coiled-coil structure based on the Lupas algorithm (Lupas, 
A. Meth. Enzymol. 266, 513-525 (1996)). This region of ZC1 is most related to human 
STE20-subfamily kinases, ZC3 (SEQ ID NO:15), ZC2 (SEQ ID NO:14), and GEK2 (SEQ 
ID NO:107), as well as to human PITSLRE (GB:U04824) sharing 65.5%, 65.4%, 25.3%, 
and 29.0% amino acid identity, respectively. The ZC1 coiled-coil domain also shares 
90.6% amino acid homology to murine NIK. The C. elegans homologue ZC504.4 shares 
32.2% sequence identity over this region. 

[0231] Within the predicted coiled-coil domain of human ZC1, and the related 
ZC3, is a region predicted to form a leucine zipper (Leu-X6-Leu-X6-Leu-X6-Leu-X20- 
Leu-X6-Leu) (SEQ ID NO:149). The fact that this leucine repeat exists within a predicted 
coiled-coil structure suggests that the leucine zipper may have a high probability of serving 
as a dimerization interface (Hirst, J.D. et al., Protein Engineering 9, 657-662 (1996)), 
mediating a potential inter- or intra-molecular dimerization of human ZC1. 
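
As an illustrative sketch, the leucine repeat pattern recited above (Leu-X6-Leu-X6-Leu-X6-Leu-X20-Leu-X6-Leu) can be located in a protein sequence with a regular expression. Treating "X" as any single uppercase letter and reporting 1-based start positions are assumptions of this sketch, and the test sequence is synthetic.

```python
import re

# L, then 6 of any residue, repeated per the recited spacing (6, 6, 6, 20, 6).
# A lookahead is used so that overlapping occurrences are all reported.
LEU_REPEAT = re.compile(
    r"(?=(L[A-Z]{6}L[A-Z]{6}L[A-Z]{6}L[A-Z]{20}L[A-Z]{6}L))"
)


def find_leucine_repeats(seq: str):
    """Return (1-based start, matched segment) for each pattern occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in LEU_REPEAT.finditer(seq.upper())]


# Synthetic sequence built to contain exactly one occurrence.
motif = "L" + "A" * 6 + "L" + "A" * 6 + "L" + "A" * 6 + "L" + "A" * 20 + "L" + "A" * 6 + "L"
print(find_leucine_repeats("MK" + motif + "GS"))
```

A pattern match alone does not establish that the segment forms a zipper; as the text notes, its location within a predicted coiled-coil is what makes a dimerization role plausible.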

[0232] The 114 amino acid proline-rich region of human ZC1 is most related to 
human STE20-subfamily kinases, ZC2 (SEQ ID NO:14) and ZC3 (SEQ ID NO:15), 
sharing 35.8% and 24.9% amino acid identity, respectively. The ZC1 proline-rich domain 
shares 36.4% amino acid homology to murine NIK (GB:U88984). Three potential "PxxP" 
(SEQ ID NO:148) SH3 domain-binding motifs (I, II and III) are found within the 
proline-rich region of human ZC1. Motif I is conserved in human ZC1 and C. elegans 
ZC504.4 (GB:Z50029). Motif II is conserved in ZC1, ZC2, ZC3, ZC4 and C. elegans 
ZC504.4. Motif III is conserved in ZC1, ZC2, ZC3 and ZC4. Motifs II and III of murine 
NIK have been shown to bind the SH3 motif of the adaptor molecule Nck (Su, Y-C. et al., 
EMBO J. 16, 1279-1290 (1997)). From this evidence, human ZC1 may have the potential 
to bind to Nck or 
other SH3 or WW domain-containing proteins and participate in growth factor-induced 
signaling pathways. 
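
For illustration, candidate "PxxP" motifs of the kind discussed above can be enumerated with a short scan such as the following. The function name and the example sequence are hypothetical, and a bare PxxP match is only a minimal criterion; it does not by itself indicate binding to any particular SH3 domain.

```python
import re

# Lookahead so that overlapping PxxP occurrences are all reported.
PXXP = re.compile(r"(?=(P[A-Z]{2}P))")


def find_pxxp(seq: str):
    """Return (1-based start, 4-residue match) for every PxxP occurrence."""
    return [(m.start() + 1, m.group(1)) for m in PXXP.finditer(seq.upper())]


# Hypothetical proline-rich fragment, not taken from the Sequence Listing.
print(find_pxxp("AAPPLPPRPAA"))
```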

[0233] The 256 amino acid spacer region of human ZC1 is most related to human 
STE20-subfamily kinases, ZC2 (SEQ ID NO: 14) and ZC3 (SEQ ID NO: 15), as well as to 
human PITSLRE (GB:U04824), sharing 59.9%, 33.1%, 29.6%, and 26.4% amino acid 
identity, respectively. It also shares 59.9% amino acid homology to murine NIK. The C. 
elegans homologue ZC504.4 has only limited sequence similarity in this spacer region. 

[0234] The 343 amino acid C-terminal domain of human ZC1 is most related to human 
STE20-subfamily kinases, ZC3 (SEQ ID NO:15), ZC2 (SEQ ID NO:14), and ZC4 (SEQ 
ID NO:16), sharing 89.2%, 88.9%, and 42.3% amino acid identity, respectively. The ZC1 
C-terminal domain also shares 98.8% amino acid identity to murine NIK. The C. elegans 
homologue ZC504.4 also shares 68.7% amino acid identity with the C-tail of human ZC1. 
A lower, yet significant, homology is also evident to human KHS2 (SEQ ID NO: 18), 
GCK (GB:U07349), and murine citron (GB:U07349) with 26.6%, 23.1% and 36.2% 
amino acid identity, respectively. GCK is a STE20-family kinase whose C-terminal 
domain has been shown to bind the small G-protein Rab8 (Ren, M. et al., Proc. Natl. 
Acad. Sci. 93, 5151-5155 (1996)). Citron is a non-kinase Rho-binding protein (Madaule, 
P. et al., FEBS Lett. 377, 243-248 (1995)). 

[0235] The sequence similarity of the C-terminal region of ZC1 to proteins that 
have potential Rab- or Rho-binding domains suggests that ZC1 may signal through a small 
G-protein-dependent pathway. 

Mammalian ZC2 

[0236] The 4055 bp human ZC2 nucleotide sequence of the partial cDNA encodes 
a polypeptide of 1297 amino acids (SEQ ID NO: 14) with a predicted molecular mass of 
147,785 daltons. Analysis of the deduced amino acid sequence predicts ZC2 to be an 
intracellular serine/threonine kinase, lacking both a signal sequence and transmembrane 
domain; however, the cDNA clone lacks an initiating ATG, so the full extent of its 
N-terminus is not known. The N-terminally truncated ZC2 protein contains a 255 amino 
acid catalytic domain with all the motifs characteristic of a serine/threonine kinase, a 187 
amino acid region predicted to form a coiled-coil structure, a 184 amino acid proline-rich 
region, a 328 amino acid spacer region, followed by a 343 amino acid C-terminal domain 
containing a potential Rab/Rho-binding region. 

[0237] ZC2 is most closely related to the human STE20-subfamily kinases ZC3 
(SEQ ID NO:15), ZC1 (SEQ ID NO:13), and ZC4 (SEQ ID NO: 16), sharing 88.3%, 
61.7%, and 41.9% amino acid identity, respectively, and shares 41.7% amino acid identity 
to a C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029). 

[0238] The 255 amino acid catalytic domain of human ZC2 is most related to 
human STE20-subfamily kinases, ZC1 (SEQ ID NO:13), ZC3 (SEQ ID NO:15), SOK-1 
(GB:X99325), KHS2 (SEQ ID NO: 18), MST1 (GB:U18297), and GCK (GB:U07349), 
and to the C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029) sharing 90.2%, 
89.8%, 49.0%, 48.6%, 47.9%, 45.0%, and 76.7% amino acid identity, respectively. ZC2 
contains the potential "TPY" regulatory phosphorylation site in its activation loop. This 
"TPY" motif is conserved in other STE20-related kinases, including ZC1, ZC3, ZC4, 
GEK2, KHS2, SULU1, SULU3, PAK4 and PAK5. 

[0239] Immediately C-terminal to the kinase domain of human ZC2 is a 187 amino 
acid region predicted to form a coiled-coil structure based on the Lupas algorithm (supra). 
This region of ZC2 is most related to human STE20-subfamily kinases, ZC1 (SEQ ID 
NO:13), ZC3 (SEQ ID NO:15), and GEK2 (SEQ ID NO:107), as well as to human 
PITSLRE (GB:U04824), sharing 65.8%, 61.5%, 29.7% and 29.6% amino acid identity, 
respectively. The C. elegans homologue ZC504.4 shares 30.8% sequence identity over 
this region. Human ZC2 lacks the potential leucine zipper found in ZC1 as a consequence 
of a 29 amino acid deletion relative to ZC1 and ZC3. 

[0240] The 184 amino acid proline-rich region of human ZC2 is most related to 
human STE20-subfamily kinases, ZC3 (SEQ ID NO:15) and ZC1 (SEQ ID NO:13), 
sharing 35.9% and 28.6% amino acid identity, respectively. Significant homology is also 
evident to the murine WW domain-binding protein WBP7 (GB:U92455), and to the 
human SH3 domain-binding protein 3BP-1 (GB:X87671), with 27.7% and 25.3% amino 
acid identity, respectively. 

[0241] ZC2 contains two of the potential "PxxP" (SEQ ID NO:148) SH3 domain-binding 
motifs (II and III) found within the proline-rich region of human ZC1. Motif II is 
conserved in ZC1, ZC3, ZC4 and C. elegans ZC504.4, and Motif III is conserved in ZC1, 
ZC3 and ZC4. Motifs II and III of murine NIK have been shown to bind the SH3 motif of 
the adaptor molecule Nck. From this evidence, human ZC2 may have the potential to bind 
to Nck or other SH3 or WW domain-containing proteins, and to participate in growth 
factor-induced signaling pathways. 

[0242] The 328 amino acid spacer region of human ZC2 is most related to human 
STE20-subfamily kinases ZC1 (SEQ ID NO:13) and ZC3 (SEQ ID NO:15), and to murine 
NIK (GB:U88984), sharing 31.6%, 26.9% and 25.9% amino acid identity, respectively. 
The C. elegans homologue ZC504.4 has only limited sequence similarity in this spacer 
region. 

[0243] The 343 amino acid C-terminal domain of human ZC2 is most related to human 
STE20-subfamily kinases ZC1 (SEQ ID NO:13), ZC3 (SEQ ID NO:15) and ZC4 (SEQ ID 
NO:16), and to murine NIK (GB:U88984), sharing 88.9%, 88.3%, 41.9%, and 88.0% 
amino acid identity, respectively. The C. elegans homologue, ZC504.4, also shares 67.2% 
amino acid identity with the C-tail of human ZC2. A lower, yet significant, homology is 
also evident to human GCK (GB:U07349), murine citron (GB:U07349), and the S. 
cerevisiae ROM2 protein (GB:U19103), a Rho1 GDP/GTP exchange factor, with 22.3%, 
22.2% and 21.9% amino acid identity, respectively. 

[0244] The sequence similarity of the C-terminal region of ZC2 to proteins that 
have potential Rab- or Rho-binding domains suggests that ZC2, like ZC1, may also signal 
through a small G-protein-dependent pathway. 

Mammalian ZC3 

[0245] The 4133 bp human ZC3 nucleotide sequence of the partial cDNA encodes 
a polypeptide of 1326 amino acids (SEQ ID NO: 15) with a predicted molecular mass of 
149,906 daltons. Analysis of the deduced amino acid sequence predicts ZC3 to be an 
intracellular serine/threonine kinase, lacking both a signal sequence and transmembrane 
domain; however, the cDNA clone lacks an initiating ATG, so the full extent of its 
N-terminus is not known. The N-terminally truncated ZC3 protein contains a 255 amino acid 
catalytic domain with all the motifs characteristic of a serine/threonine kinase, a 221 
amino acid region predicted to form a coiled-coil structure, a 204 amino acid proline-rich 
region, and a 303 amino acid spacer region followed by a 343 amino acid C-terminal 
domain containing a potential Rab/Rho-binding region. 

[0246] ZC3 is most closely related to the human STE20-subfamily kinases ZC1 
(SEQ ID NO:13), ZC2 (SEQ ID NO: 14), and ZC4 (SEQ ID NO: 16), sharing 62.0%, 
61.0%, and 42.5% amino acid identity, respectively and shares 46.7% amino acid identity 
to a C. elegans kinase encoded by the cosmid ZC504.4 (GB:Z50029). 

[0247] The 255 amino acid catalytic domain of human ZC3 is most related to 
human STE20-subfamily kinases, ZC1 (SEQ ID NO:13), ZC2 (SEQ ID NO: 14), SOK-1 
(GB:X99325), KHS2 (SEQ ID NO:18), GCK (GB:U07349), SULU1 (SEQ ID NO:22), 
and GEK2 (SEQ ID NO: 107), and to the C. elegans kinase encoded by the cosmid 
ZC504.4 (GB:Z50029) sharing 90.6%, 89.3%, 49.0%, 48.3%, 45.0%, 43.1%, 42.3% and 
76.7% amino acid identity, respectively. ZC3 contains the potential "TPY" regulatory 
phosphorylation site in its activation loop. This "TPY" motif is conserved in other 
STE20-related kinases, including ZC1, ZC2, GEK2, KHS2, SULU1, SULU3, PAK4 and 
PAK5. 

[0248] Immediately C-terminal to the kinase domain of human ZC3 is a 221 amino 
acid region predicted to form a coiled-coil structure based on the Lupas algorithm (supra). 
This region of ZC3 is most homologous to human STE20-subfamily kinases, ZC1 (SEQ 
ID NO:13), ZC2 (SEQ ID NO:14), and GEK2 (SEQ ID NO:107), sharing 66.9%, 61.5%, 
and 21.5% identity, as well as to rat PLC-beta (GB:A45493) and human PITSLRE 
(GB:H54024) sharing 29.6% and 25.9% amino acid identity, respectively. The C. elegans 
homologue ZC504.4 shares 26.8% sequence identity over this region. 

[0249] Within the predicted coiled-coil domain of human ZC3, and the related 
ZC1, is a region predicted to form a leucine zipper (Leu-X6-Leu-X6-Leu-X6-Leu-X20- 
Leu-X6-Leu) (SEQ ID NO: 149). The fact that this leucine repeat exists within a predicted 
coiled-coil structure suggests that the leucine zipper may have a high probability of serving 
as a dimerization interface (Hirst, J.D. et al., Protein Engineering 9, 657-662 (1996)), 
mediating a potential inter- or intra-molecular dimerization of human ZC3. 

[0250] The 204 amino acid proline-rich region of human ZC3 is most related to 
human STE20-subfamily kinases, ZC1 (SEQ ID NO:13) and ZC2 (SEQ ID NO:14), 
sharing 66.9% and 61.5% amino acid identity, respectively. 

[0251] ZC3 contains two of the potential "PxxP" (SEQ ID NO:148) SH3 domain-binding 
motifs (II and III) found within the proline-rich region of human ZC1. Motif II is 
conserved in ZC1, ZC2, ZC4 and C. elegans ZC504.4; Motif III is conserved in ZC1, ZC2 
and ZC4. Motifs II and III of murine NIK have been shown to bind the SH3 motif of the 
adaptor molecule Nck. From this evidence, human ZC3 may have the potential to bind to 
Nck or other SH3 or WW domain-containing proteins and participate in growth 
factor-induced signaling pathways. 

[0252] The 303 amino acid spacer region of human ZC3 is most related to 
human STE20-subfamily kinases, ZC1 (SEQ ID NO:13) and ZC2 (SEQ ID NO:14), 
sharing 30.1% and 27.1% amino acid identity, respectively. The C. elegans homologue 
ZC504.4 lacks nearly the entire spacer region of ZC3. 

[0253] The 343 amino acid C-terminal domain of human ZC3 is most related to human 
STE20-subfamily kinases, ZC1 (SEQ ID NO:13), ZC2 (SEQ ID NO:14) and ZC4 (SEQ ID 
NO:16), sharing 89.2%, 88.9%, and 42.5% amino acid identity, respectively. The C. 
elegans homologue ZC504.4 also shares 67.2% amino acid identity with the C-tail of 
human ZC3. A lower, yet significant, homology is also evident to human GCK 
(GB:U07349), as well as to the non-kinases murine citron (GB:U07349) and the S. 
cerevisiae ROM2 protein (GB:U19103), a Rho1 GDP/GTP exchange factor, with 21.6%, 
32.4% and 22.9% amino acid identity, respectively. 

[0254] The sequence similarity of the C-terminal region of ZC3 to proteins that 
have potential Rab- or Rho-binding domains suggests that ZC3, like ZC1 and ZC2, may 
signal through a small G-protein-dependent pathway. 

Mammalian ZC4 

[0255] The 3,684 bp human ZC4 nucleotide sequence of the complete cDNA 
encodes a polypeptide of 1,227 amino acids (SEQ ID NO: 105) with a predicted molecular 
mass of 138,205 daltons. Analysis of the deduced amino acid sequence predicts ZC4 to 
be an intracellular STE20-subfamily kinase, lacking both a signal sequence and a 
transmembrane domain. The full-length ZC4 protein contains a 25 amino acid N-terminus, 
a 265 amino acid catalytic domain with all the motifs characteristic of a serine/threonine 
kinase, a 108 amino acid region predicted to form a coiled-coil structure, a 231 amino acid 
proline-rich region, a 40 amino acid region predicted to form a coiled-coil structure, 
a 204 amino acid spacer region (domain B), followed by a 355 amino acid 
C-terminal domain containing a potential Rab/Rho-binding region (domain C). 

[0256] ZC4 is most closely related to human ZC1 (SEQ ID NO: 13, also known as 
human HGK, human KIAA0687, murine NIK, human AC005035, human NIK, and C. 
elegans MIG-15), ZC2 (SEQ ID NO: 14, similar to partial sequence human KIAA0551), 
and ZC3 (SEQ ID NO: 15). An assembled genomic fragment in the database (Z83850) is 
identical to ZC4, except for inappropriate identification of the exon boundaries. (Abo et 
al. (1998) EMBO J. 17: 6527-6540.) 

[0257] The 25 amino acid N-terminal domain of human ZC4 shares weak 
homology to human ZC1 in its C-terminal extent, but otherwise does not reveal any 
significant homologies to the protein database. 

[0258] The 265 amino acid catalytic domain of human ZC4 is most related to 
human ZC1 (SEQ ID NO: 13), ZC3 (SEQ ID NO:15), and ZC2 (SEQ ID NO: 14), sharing 
63%, 64% and 62% amino acid identity, respectively. 
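
The percent identity figures quoted throughout this section can be illustrated with a minimal sketch. It assumes a pair of sequences that has already been aligned (gaps written as "-") and assumes identity is counted only over columns where both sequences have a residue; the text does not state which denominator convention was used, and the fragments below are hypothetical rather than taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: excluded from the comparison (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
example = percent_identity("MKT-LLVA", "MRTALL-A")
print(round(example, 1))
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same alignment.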

[0259] Immediately C-terminal to the kinase domain of human ZC4 is a 108 amino 
acid region predicted to form a coiled-coil structure based on the Lupas algorithm (supra). 
This region is most related to human ZC1 (SEQ ID NO:13), ZC3 (SEQ ID NO:15), and 
ZC2 (SEQ ID NO: 14), sharing 29%, 25% and 20% amino acid identity, respectively. 

[0260] The 231 amino acid proline-rich region of human ZC4 does not reveal any 
significant homologies to the protein database. This region of ZC4 contains two "PxxP" 
(SEQ ID NO: 148) motifs that could potentially bind to proteins containing SH3 or WW 
domains and allow ZC4 to participate in growth factor activated signaling pathways. In 
addition, within the pro-rich domain of human ZC4 is a region predicted to form a leucine 
zipper (Leu-X6-Leu-X6-Leu-X6-Leu-X20-Leu-X6-Leu) (SEQ ID NO: 149) which may 
serve as a dimerization interface. The ZC STE20 subfamily kinases (ZC1, ZC2 and ZC3) 
have similarly located "PxxP" (SEQ ID NO: 148) motifs and potential Leu zippers. 
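
The two sequence patterns named above can be expressed as regular expressions. The sketch below is illustrative only: the patterns are transcribed from the text ("PxxP" and Leu-X6-Leu-X6-Leu-X6-Leu-X20-Leu-X6-Leu), while the function names and the toy sequence are hypothetical and are not taken from SEQ ID NO: 105.

```python
import re

# Lookahead so that overlapping PxxP occurrences are all reported.
PXXP = re.compile(r"(?=(P..P))")
# Leu-X6-Leu-X6-Leu-X6-Leu-X20-Leu-X6-Leu, as written in the text.
LEU_ZIPPER = re.compile(r"L.{6}L.{6}L.{6}L.{20}L.{6}L")


def find_pxxp(seq):
    """Return (1-based start, matched residues) for every PxxP occurrence."""
    return [(m.start() + 1, m.group(1)) for m in PXXP.finditer(seq)]


def find_leu_zipper(seq):
    """Return (1-based start, 1-based end) of the first zipper-like match, or None."""
    m = LEU_ZIPPER.search(seq)
    return None if m is None else (m.start() + 1, m.end())


# Hypothetical toy sequence built to contain both patterns.
toy = ("MSPAAPGG"
       + "L" + "A" * 6 + "L" + "A" * 6 + "L" + "A" * 6 + "L"
       + "A" * 20 + "L" + "A" * 6 + "L"
       + "GPPEPK")

print(find_pxxp(toy))
print(find_leu_zipper(toy))
```

A pattern match of this kind only flags candidate sites; whether a site actually binds an SH3 or WW domain, or forms a zipper, is not something the pattern can establish.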

[0261] Immediately C-terminal to the proline-rich region of human ZC4 is a 40 
amino acid region also predicted to form a coiled-coil structure based on the Lupas 
algorithm. This region of human ZC4 does not reveal any significant homologies to the 
protein database. 

[0262] The 204 amino acid acidic- and serine-rich domain "B" of ZC4 does not 
reveal any significant homologies to the protein database. 

[0263] The 355 amino acid C-terminal domain of human ZC4 is most related to human 
ZC1 (SEQ ID NO:13), ZC3 (SEQ ID NO: 15), and ZC2 (SEQ ID NO: 14), sharing 43%, 
42% and 42% amino acid identity, respectively. 

[0264] The sequence similarity of the C-terminal region of ZC4 to proteins that 
have potential Rab- or Rho-binding domains suggests that ZC4, like other ZC-subfamily 
STE20 kinases, may signal through a small G-protein-dependent pathway. 

Mammalian KHS2 

[0265] The 4023 bp human KHS2 nucleotide sequence encodes a polypeptide of 
894 amino acids (SEQ ID NO: 18) with a predicted molecular mass of 101,327 daltons. 
Analysis of the deduced amino acid sequence predicts KHS2 to be an intracellular 
serine/threonine kinase, lacking both a signal sequence and transmembrane domain. The 
full-length KHS2 protein contains a 13 amino acid N-terminus, a 260 amino acid catalytic 
domain with all the motifs characteristic of a serine/threonine kinase, a 73 amino acid 
spacer region, a 188 amino acid proline-rich region, followed by a 360 amino acid C-terminal domain 
containing a potential Rab/Rho-binding site. 

[0266] KHS2 is most closely related to the human STE20-subfamily kinases KHS1 
(GB:U77129), GCK (GB:U07349), and HPK1 (GB:U66464), sharing 65.5%, 51.9%, and 
44.9% amino acid identity, respectively, and shares 38.5% amino acid identity with a C. 
elegans STK (GB:U55363). 

[0267] The 13 amino acid N-terminal domain of human KHS2 does not reveal any 
significant homologies that might suggest a potential function for this domain when 
examined by a Smith-Waterman alignment to the nonredundant protein database. Human 
KHS2 lacks a glycine residue at position 2, and is therefore unlikely to undergo 
myristylation. 
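
The position-2 glycine criterion used above can be written as a one-line check. This is a minimal sketch with a hypothetical function name and hypothetical toy sequences; it tests only the condition stated in the text. A glycine after the initiator methionine is necessary for N-terminal myristoylation but does not by itself show that the modification occurs.

```python
def has_glycine_at_position_2(protein: str) -> bool:
    """True if the residue following the initiator Met (1-based position 2) is Gly."""
    return len(protein) >= 2 and protein[0] == "M" and protein[1] == "G"


# Hypothetical N-terminal fragments, for illustration only.
candidate = has_glycine_at_position_2("MGSSK")   # Gly at position 2
unlikely = has_glycine_at_position_2("MASSK")    # Ala at position 2
print(candidate, unlikely)
```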

[0268] The 260 amino acid catalytic domain of human KHS2 is most related to 
human STE20-subfamily kinases KHS1 (GB:U77129), GCK (GB:U07349), HPK1 
(GB:U66464), SOK-1 (GB:X99325), MST1 (GB:U18297), ZC1 (SEQ ID NO: 13), and to 
the C. elegans kinase (GB:U55363), sharing 85.4%, 75.1%, 67.7%, 51.4%, 48.1%, 49.8% 
and 72.0% amino acid identity, respectively. KHS2 contains the potential "TPY" 
regulatory phosphorylation site in its activation loop. This "TPY" motif is conserved in 
other STE20-related kinases, including ZC1, ZC2, ZC3, ZC4, GEK2, SULU1, SULU3, 
PAK4 and PAK5. 

[0269] The 73 amino acid spacer region of human KHS2 is most related to 
human STE20-subfamily kinases KHS1 (GB:U77129), HPK1 (GB:U66464) and GCK 
(GB:U07349), sharing 60.3%, 43.5% and 44.0% amino acid identity, respectively. 

[0270] The 188 amino acid proline-rich region of human KHS2 is most related to 
human STE20-subfamily kinases HPK1 (GB:U66464), GCK (GB:U07349) and KHS1 
(GB:U77129), sharing 33.3%, 31.9% and 31.4% amino acid identity, respectively. 

[0271] Two potential "PxxP" (SEQ ID NO: 148) SH3 domain-binding motifs (I 
and II) are found within the proline-rich region of human KHS2. Motif I is conserved with 
human KHS1 and HPK1; motif II is conserved with GCK and KHS2. A 192 amino acid 
region of human HPK1 containing motif II has been shown to bind to the C-terminal SH3 
motif of the adaptor molecule Grb2 (Anafi, M. et al., J. Biol. Chem. 272, 27804-27811 
(1997)). Human KHS2 may bind SH3 or WW domain-containing proteins through this 
proline-rich region. 

[0272] The 360 amino acid C-terminal domain of human KHS2 is most related to KHS1 
(GB:U77129), GCK (GB:U07349) and HPK1 (GB:U66464), and to the C. elegans kinase 
(GB:U55363), sharing 74.9%, 54.8%, 42.9%, and 31.0% amino acid identity, 
respectively. GCK is a STE20-family kinase whose C-terminal domain has been shown to 
bind the small G-protein Rab8 (Ren, M. et al., Proc. Natl. Acad. Sci. 93, 5151-5155 
(1996)). 

Mammalian SULU1 

[0273] The 4196 bp human SULU1 nucleotide sequence encodes a polypeptide of 
898 amino acids (SEQ ID NO:22) with a predicted molecular mass of 105,402 daltons. 
Analysis of the deduced amino acid sequence predicts SULU1 to be an intracellular 
serine/threonine kinase, lacking both a signal sequence and transmembrane domain. The 
full-length SULU1 protein contains a 21 amino acid N-terminus, a 256 amino acid 
catalytic domain with all the motifs characteristic of a serine/threonine kinase, a 150 amino 
acid spacer region, a 210 amino acid region predicted to form a coiled-coil structure, a 114 
amino acid spacer region and a 147 amino acid C-terminal domain predicted to form a 
coiled-coil structure. 
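
As a consistency check on the domain architecture just described, the listed domain lengths can be summed and converted to residue coordinates. The sketch below assumes the domains are contiguous and non-overlapping in the order given; the text reports only lengths, so the coordinates it produces are derived values, and the variable and function names are hypothetical.

```python
# Domain lengths for human SULU1 as listed in the text, N- to C-terminus.
SULU1_DOMAINS = [
    ("N-terminus", 21),
    ("catalytic domain", 256),
    ("spacer region", 150),
    ("coiled-coil region", 210),
    ("second spacer region", 114),
    ("C-terminal coiled-coil domain", 147),
]


def domain_boundaries(domains):
    """Return (name, start, end) with 1-based inclusive coordinates,
    assuming the domains follow one another without gaps or overlaps."""
    boundaries = []
    start = 1
    for name, length in domains:
        end = start + length - 1
        boundaries.append((name, start, end))
        start = end + 1
    return boundaries


total_length = sum(length for _, length in SULU1_DOMAINS)
boundaries = domain_boundaries(SULU1_DOMAINS)
print(total_length)
for row in boundaries:
    print(row)
```

The total of 898 residues agrees with the polypeptide length stated for SULU1. The same check applied to other kinases in this section is a quick way to spot transcription errors in the domain lengths.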

[0274] SULU1 is most closely related to the STE20-subfamily kinases murine 
SULU3 (SEQ ID NO:24), human SULU3 (SEQ ID NO:23), and to the C. elegans kinase 
SULU (GB:U11280), sharing 68.9%, 72.2% and 38.2% amino acid identity, respectively. 

[0275] The 21 amino acid N-terminal domain of human SULU1 is most related to 
murine SULU3 (SEQ ID NO:24) and to the C. elegans kinase SULU (GB:U11280), 
sharing 86.3% and 62.3% amino acid identity. Human SULU1 lacks a glycine residue at 
position 2, and is therefore unlikely to undergo myristoylation. A Smith-Waterman search 
of the nonredundant protein database does not reveal any significant homologies that 
might suggest a potential function for this domain. 

[0276] The 256 amino acid catalytic domain of human SULU1 is most related to 
murine SULU3 (SEQ ID NO:24), and to human SOK-1 (GB:X99325), STLK2 (SEQ ID 
NO:5), MST1 (GB:U18297), PAK1 (GB:U24152), ZC2 (SEQ ID NO:14), and KHS2 
(SEQ ID NO:18) sharing 86.3%, 48.1%, 46.9%, 45.2%, 43.3%, 43.1% and 42.0% amino 
acid identity, respectively. The C. elegans SULU STK (GB:U11280) shares 62.3% 
sequence identity over this region. SULU1 contains the potential "TPY" regulatory 
phosphorylation site in its activation loop. This "TPY" motif is conserved in other 
STE20-related kinases, including ZC1, ZC2, ZC3, ZC4, GEK2, KHS2, SULU3, PAK4 
and PAK5. 

[0277] The 150 amino acid spacer region of human SULU1 is most related to 
human SULU3 (SEQ ID NO:23) and to the C. elegans kinase (GB:U11280), sharing 
53.5% and 10.4% amino acid identity, respectively. 

[0278] Immediately C-terminal to the spacer region of human SULU1 is a 210 
amino acid region predicted to form a coiled-coil structure based on the Lupas algorithm. 
This region of SULU1 is most related to SULU3 (SEQ ID NO:23), the C. elegans SULU 
kinase (GB:U11280), GEK2 (SEQ ID NO:107) and ZC1 (SEQ ID NO:13), sharing 
68.6%, 26.8%, 23.2%, and 22.8% amino acid identity, respectively. 

[0279] The 114 amino acid spacer region of human SULU1 is most related to human 
SULU3 (SEQ ID NO:24) with 73.7% amino acid sequence identity. A lower, yet 
significant, homology is also evident to murine PITSLRE (GB:U04824) and DLK 
(GB:A55318), human ZC1 (SEQ ID NO:13) and GEK2 (SEQ ID NO:107), as well as to 
the C. elegans SULU STK (GB:U11280), sharing 39.7%, 35.4%, 29.5%, 23.6% and 
37.6% amino acid identity, respectively. 

[0280] Immediately C-terminal to the second spacer region of human SULU1 is a 
147 amino acid region predicted to form a coiled-coil structure based on the Lupas 
algorithm. This region of SULU1 is most related to human SULU3 (SEQ ID NO:24), ZC1 
(SEQ ID NO: 13) and GEK2 (SEQ ID NO: 107), as well as to the C. elegans SULU STK 
(GB:U11280), sharing 73.3%, 28.4%, 26.1% and 39.5% amino acid identity, respectively. 

Mammalian (human) SULU3 

[0281] The 3824 bp partial cDNA human SULU3 nucleotide sequence encodes a 
polypeptide of 786 amino acids (SEQ ID NO:23) with a predicted molecular mass of 
92,037 daltons. Analysis of the deduced amino acid sequence predicts SULU3 to be an 
intracellular serine/threonine kinase lacking a transmembrane domain. The N-terminally 
truncated human SULU3 protein contains a 66 amino acid partial catalytic domain 
followed by a 149 amino acid spacer region, a 210 amino acid region predicted to form a 
coiled-coil structure, a second spacer region of 114 amino acids, a 247 amino acid 
C-terminal region predicted to form a second coiled-coil structure and a 100 amino acid 
C-terminal tail. 

[0282] Human SULU3 is most closely related to murine SULU3 (SEQ ID NO:24), 
human SULU1 (SEQ ID NO:22), and to the C. elegans SULU kinase (GB:U11280), 
sharing 66.3%, 68.9% and 32.9% amino acid identity, respectively. The high sequence 
homology between murine and human SULU3 suggests that these two proteins are 
orthologs of each other. 

[0283] The 66 amino acid partial catalytic domain of human SULU3 is most 
related to murine SULU3 (SEQ ID NO:24), and to the human STE20 subfamily kinases 
ZC1 (SEQ ID NO:13), STE20 (GB:X99325), KHS1 (GB:U77129) and GEK2 (SEQ ID 
NO: 107), as well as to the C. elegans SULU kinase (GB:U11280), sharing 83.3%, 47.0%, 
45.5%, 43.5%, 41.8% and 55.6% amino acid identity, respectively. 

[0284] The 149 amino acid spacer region of human SULU3 is most related to 
murine SULU3 (SEQ ID NO:24), human STE20 (GB:X99325), MST1 (GB:U18297), and 
to the C. elegans SULU kinase (GB:U11280) sharing 98.7%, 21.9% and 21.8% amino acid 
identity, respectively. 

[0285] Immediately C-terminal to the first spacer region of human SULU3 is a 210 
amino acid region predicted to form a coiled-coil structure based on the Lupas algorithm. 
This region of SULU3 is most related to murine SULU3 (SEQ ID NO:24), and to human 
SULU1 (SEQ ID NO:22), ZC1 (SEQ ID NO: 13) and GEK2 (SEQ ID NO: 107), as well as 
to the C. elegans SULU kinase (GB:U11280), sharing 99.5%, 68.6%, 27.4% and 22.5% 
amino acid identity, respectively. 

[0286] The 114 amino acid second spacer region of human SULU3 is most related 
to murine SULU3 (SEQ ID NO:24), and to human SULU1 (SEQ ID NO:22), GEK2 (SEQ 
ID NO: 107), and ZC1 (SEQ ID NO: 13), as well as to the C. elegans SULU kinase 
(GB:U11280), sharing 99.1%, 73.7%, 24.6%, 24.1% and 41.2% amino acid identity, 
respectively. 

[0287] Immediately C-terminal to the second spacer region of human SULU3 is a 
247 amino acid region predicted to form a coiled-coil structure based on the Lupas 
algorithm (supra). This region of SULU3 is most related to human SULU1 (SEQ ID 
NO:22) and ZC1 (SEQ ID NO: 13) as well as to rat PKN (GB:D26180), murine p160 
ROCK1 (GB:U58512), and the C. elegans SULU kinase (GB:U11280), sharing 73.7%, 
26.7%, 24.0% and 21.0% amino acid identity, respectively. 

[0288] The 100 amino acid C-tail of human SULU3 is most related to a human 
prion protein (GB:L38993), with 45.0% amino acid identity. 

Mammalian (murine) SULU3 

[0289] The 2249 bp murine, partial cDNA SULU3 nucleotide sequence encodes a 
polypeptide of 748 amino acids (SEQ ID NO: 24) with a predicted molecular mass of 
87,520 daltons. Analysis of the deduced amino acid sequence predicts SULU3 to be an 
intracellular serine/threonine kinase, lacking both a signal sequence and transmembrane 
domain. The partial murine SULU3 protein contains a 25 amino acid N-terminus, a 248 
amino acid catalytic domain with all the motifs characteristic of a serine/threonine kinase, 
a 149 amino acid spacer region, a 210 amino acid region predicted to form a coiled-coil 
structure, and a 116 amino acid spacer region. 

[0290] Murine SULU3 is most closely related to human SULU3 (SEQ ID NO:23) 
and SULU1 (SEQ ID NO:22), as well as to the C. elegans SULU kinase (GB:U11280), 
sharing 97.0%, 72.3% and 38.4% amino acid identity, respectively. The high sequence 
homology between murine and human SULU3 suggests that these two proteins are 
orthologs. 

[0291] The 25 amino acid N-terminal domain of murine SULU3 is most related to 
human SULU1 (SEQ ID NO:22) and to the C. elegans SULU kinase (GB:U11280), 
sharing 70.0% and 44.4% amino acid identity, respectively. 

[0292] Murine SULU3 lacks a glycine residue at position 2, and is therefore 
unlikely to undergo myristoylation. A Smith-Waterman search of the nonredundant 
protein database does not reveal any significant homologies that might suggest a potential 
function for this domain. 

[0293] The 248 amino acid catalytic domain of murine SULU3 is most related to 
human SULU1 (SEQ ID NO:22), STE20 (GB:X99325), ZC1 (SEQ ID NO: 13), and KHS1 
(GB:U77129), as well as to the C. elegans SULU kinase (GB:U11280), sharing 86.7%, 
46.6%, 43.3%, 59.4% amino acid identity, respectively. Murine SULU3 contains the 
potential "TPY" regulatory phosphorylation site in its activation loop. This "TPY" motif is 
conserved in other STE20-related kinases, including ZC2, ZC3, ZC4, GEK2, KHS2, 
SULU1, SULU3, PAK4 and PAK5. 

[0294] The 149 amino acid spacer of murine SULU3 is most related to human 
SULU3 (SEQ ID NO:23), SULU1 (SEQ ID NO:22), and STE20 (GB:X99325), as well as 
to the C. elegans SULU (GB:U11280) and the S. cerevisiae STE20 (GB:L04655) kinases, 
sharing 98.7%, 53.4%, 21.9%, 59.4% and 21.9% amino acid identity, respectively. 

[0295] Immediately C-terminal to the spacer region of murine SULU3 is a 210 
amino acid region predicted to form a coiled-coil structure based on the Lupas algorithm. 
This region of murine SULU3 is most related to human SULU3 (SEQ ID NO:23), ZC1 
(SEQ ID NO:13), and GEK2 (SEQ ID NO:107), as well as to the C. elegans SULU kinase 
(GB:U11280), sharing 99.5%, 27.4%, 22.5% and 29.2% amino acid identity, respectively. 

[0296] The 116 amino acid C-terminal spacer region of murine SULU3 is most 
related to human SULU3 (SEQ ID NO:23), GEK2 (SEQ ID NO:107), and ZC1 (SEQ ID 
NO:13), as well as to the C. elegans SULU kinase (GB:U11280), sharing 98.3%, 24.6%, 
24.1% and 40.5% amino acid identity, respectively. 

Mammalian (murine/human) SULU3 

[0297] The 2249 bp murine SULU3 and the 3824 bp human SULU3 cDNAs 
contain a 1620 nucleotide overlap (541 amino acids) with 90% and 98% DNA and amino 
acid sequence identity, respectively. Owing to the high degree of sequence identity in this 
extended overlap, we propose that these are functional orthologues of a single gene. The 
combined murine/human 4492 bp SULU3 sequence encodes a polypeptide of 1001 amino 
acids (SEQ ID NO:31) with a predicted molecular mass of 116,069 daltons. Analysis of 
the deduced amino acid sequence predicts SULU3 to be an intracellular serine/threonine 
kinase, lacking both a signal sequence and transmembrane domain. SULU3 contains a 25 
amino acid N-terminus, a 248 amino acid catalytic domain with all the motifs 
characteristic of a serine/threonine kinase, a 149 amino acid spacer region, a 210 amino 
acid region predicted to form a coiled-coil structure and a second spacer region of 114 
amino acids, a 247 amino acid C-terminal region predicted to form a second coiled-coil 
structure and a 100 amino acid C-terminal tail. The murine SULU3 clone lacks the region 
from the second C-terminal coiled-coil to the C-terminus, whereas the human clone lacks 
the N-terminal domain, and all but 66 amino acids of the 248 amino acid kinase domain. 

[0298] SULU3 is most closely related to SULU1 (SEQ ID NO:22) and the C. 
elegans SULU kinase (GB:U11280) sharing 72.3% and 38.4% amino acid identity, 
respectively. 

[0299] The 25 amino acid N-terminal domain of SULU3 is most related to human 
SULU1 (SEQ ID NO:22) and to the C. elegans SULU kinase (GB:U11280), sharing 
70.0% and 44.4% amino acid identity, respectively. SULU3 lacks a glycine residue at 
position 2, and is therefore unlikely to undergo myristylation. A Smith-Waterman search 
of the nonredundant protein database does not reveal any significant homologies that 
might suggest a potential function for this domain. 

[0300] The 248 amino acid catalytic domain of SULU3 is most related to human 
SULU1 (SEQ ID NO:22), SOK-1 (GB:X99325), ZC1 (SEQ ID NO: 13), KHS1 
(GB:U77129) and the C. elegans SULU kinase (GB:U11280), sharing 86.7%, 46.6%, 
43.3%, 42.0% and 59.4% amino acid identity, respectively. SULU3 contains the potential 
"TPY" regulatory phosphorylation site in its activation loop. This "TPY" motif is 
conserved in other STE20-related kinases, including ZC2, ZC3, ZC4, GEK2, KHS2, 
SULU1, PAK4 and PAK5. 

[0301] The 149 amino acid spacer of SULU3 is most related to SULU1 (SEQ ID 
NO:22) and SOK-1 (GB:X99325), and to the C. elegans SULU (GB:U11280), and S. 
cerevisiae STE20 (GB:L04655) kinases, sharing 53.4%, 21.9%, 59.4% and 21.9% amino 
acid identity, respectively. 

[0302] Immediately C-terminal to the spacer region of SULU3 is a 210 amino acid 
region predicted to form a coiled-coil structure based on the Lupas algorithm. This region 
is most related to ZC1 (SEQ ID NO: 13), GEK2 (SEQ ID NO:107), and the C. elegans 
SULU kinase (GB:U11280), sharing 27.4%, 22.5% and 29.2% amino acid identity, 
respectively. 

[0303] The 114 amino acid spacer region of SULU3 is most related to human 
SULU1 (SEQ ID NO:22), GEK2 (SEQ ID NO:107), ZC1 (SEQ ID NO: 13), and to the C. 
elegans SULU kinase (GB:U11280), sharing 73.7%, 24.6%, 24.1% and 41.2% amino acid 
identity, respectively. 

[0304] Immediately C-terminal to the second spacer region of SULU3 is a 247 
amino acid region predicted to form a coiled-coil structure based on the Lupas algorithm. 
This region of SULU3 is most related to human SULU1 (SEQ ID NO:22) and ZC1 (SEQ 
ID NO: 13), as well as to rat PKN (GB:D26180), murine p160 ROCK1 (GB:U58512) and 
the C. elegans SULU kinase (GB:U11280), sharing 73.7%, 26.7%, 24.0%, 21.0% and 
37.6% amino acid identity, respectively. 

[0305] The 100 amino acid C-tail of SULU3 is most related to a human prion 
protein (GB:L38993) with 45.0% amino acid identity. 

Mammalian GEK2 

[0306] The 2926 bp human GEK2 nucleotide sequence of the complete cDNA 
encodes a polypeptide of 968 amino acids (SEQ ID NO: 107) with a predicted molecular 
mass of 112,120 daltons. Analysis of the deduced amino acid sequence predicts GEK2 to 
be an intracellular serine/threonine kinase, lacking both a signal sequence and 
transmembrane domain. The complete GEK2 protein contains a 33 amino acid N- 
terminus, a 261 amino acid catalytic domain with all the motifs characteristic of a 
serine/threonine kinase, a 43 amino acid spacer region, a 135 amino acid proline-rich 
region, a 252 amino acid region predicted to form a coiled-coil structure followed by a 244 
amino acid region also predicted to form a coiled-coil structure. 

[0307] GEK2 is most closely related to rat AT1-46 (GB:U33472) (a partial cDNA 
that extends from the middle of the first potential coiled-coil domain of GEK2 to the C- 
terminus), murine LOK (GB:D89728), Xenopus laevis polo-like kinase 1 (GB:AF100165), 
and human SLK (GB:AB002804), sharing 91.3%, 88.5%, 65.0%, and 44.7% amino acid 
identity, respectively. The high sequence homology between human GEK2, murine LOK 
and rat AT1-46 suggests that human GEK2 is a highly related protein to the rodent forms, 
or alternatively, its orthologue. Recently, a full-length version of GEK2 was reported 
(STK10 or human LOK AB015718). The 968 amino acid sequence is 99% identical to 
GEK2 (SEQ ID NO: 107). 

[0308] The 33 amino acid N-terminal domain of human GEK2 is most related to 
murine LOK (GB:D89728) and to human SLK (GB:AB002804), sharing 100% and 54.5% 
amino acid identity, respectively. 

[0309] Human GEK2 lacks a glycine residue at position 2, and is therefore unlikely 
to undergo myristylation. A Smith-Waterman search of the nonredundant protein database 
does not reveal any significant homologies that might suggest a potential function for this 
domain. 

[0310] The 261 amino acid catalytic domain of human GEK2 is most related to 
murine LOK (GB:D89728), rat AT1-46 (GB:D89728) and human SLK (GB:AB002804) 
as well as to a C. elegans kinase (GB:Z81460), sharing 97.7%, 90.8%, 54.5% and 55.9% 
amino acid identity, respectively. GEK2 contains the potential "TPY" regulatory 
phosphorylation site in its activation loop. This "TPY" motif is conserved in other 
STE20-related kinases, including ZC2, ZC3, ZC4, GEK2, KHS2, SULU1, SULU3, PAK4 
and PAK5. 

[0311] The 43 amino acid spacer region of human GEK2 is most related to murine 
LOK (GB:D89728) and to human SLK, sharing 83.7% and 77.6% amino acid identity, 
respectively. 

[0312] The 135 amino acid proline-rich region of human GEK2 is most related to 
murine LOK (GB:D89728) with 66.2% amino acid identity. Within the 
proline-rich region of human GEK2 is a potential "PxxP" (SEQ ID NO: 148) SH3-binding 
domain conserved with murine LOK. 

[0313] Immediately C-terminal to the proline-rich region of human GEK2 is a 252 
amino acid region predicted to form a coiled-coil structure based on the Lupas algorithm. 
This region of human GEK2 is most related to rat AT1-46 (GB:D89728), murine LOK 
(GB:D89728) and human SLK (GB:AB002804), and ZC2 (SEQ ID NO: 14), sharing 
90.8%, 86.9%, 42.2%, and 29.7% amino acid identity, respectively. 

[0314] Immediately C-terminal to the predicted coiled-coil structure of human 
GEK2 is a second potential coiled-coil structure of 244 amino acids predicted based on the 
Lupas algorithm. This region of human GEK2 is most related to rat AT1-46 (GB:D89728) 
and murine LOK (GB:D89728) as well as to human SLK (GB:AB002804) and ZC1 (SEQ 
ID NO: 13), sharing 91.8%, 92.6%, 70.4% and 26.7% amino acid identity, respectively. 
The C. elegans kinase (GB:Z81460) shares 31.5% amino acid sequence identity over this 
region. 

Mammalian PAK4 

[0315] The 3604 bp human PAK4 nucleotide sequence encodes a polypeptide of 
681 amino acids (SEQ ID NO:29) with a predicted molecular mass of 74,875 daltons. 
Analysis of the deduced amino acid sequence predicts PAK4 to be an intracellular 
serine/threonine kinase, lacking both a signal sequence and transmembrane domain. The 
full-length PAK4 protein contains a 51 amino acid N-terminus predicted to contain a 
Rac-binding motif, a 173 amino acid insert relative to the known mammalian PAK proteins, a 
169 amino acid spacer region, a 265 amino acid catalytic domain with all the motifs 
characteristic of a serine/threonine kinase and a 23 amino acid C-terminal tail. 

[0316] PAK4 is most closely related to human PAK5 (SEQ ID NO:30), PAK1 
(GB: U24152), and PAK65 (GB:U25975), as well as to a C. elegans kinase (GB: Z74029), 
sharing 76.8%, 49.5%, 49.8%, and 34.6% amino acid identity, respectively. 

[0317] The 51 amino acid N-terminal domain of human PAK4 is most related to 
human PAK1 (GB:U24152), and PAK65 (GB:U25975), as well as to a C. elegans kinase 
(GB: Z74029), sharing 50.0%, 50.0% and 49.0% amino acid identity, respectively. 

[0318] The 10 amino acid region at positions 13-23 of human PAK4 fits the 
consensus for a Cdc42/Rac-binding motif (SXPX4-6HXXH) (SEQ ID NO: 150) (Burbelo, 
P.D., Drechsel, D. and Hall, A. J. Biol. Chem. 270, 29071-29074 (1995)). 
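
The consensus quoted above (SXPX4-6HXXH) translates directly into a regular expression. The following is a minimal sketch: the pattern is taken from the text as written, whereas the function name and the toy sequence are hypothetical and do not come from SEQ ID NO: 29.

```python
import re

# S, any residue, P, four to six residues of any kind, H, two residues, H.
BINDING_MOTIF = re.compile(r"S.P.{4,6}H..H")


def find_binding_motif(seq):
    """Return (1-based start, 1-based end, matched residues) of the first match, or None."""
    m = BINDING_MOTIF.search(seq)
    return None if m is None else (m.start() + 1, m.end(), m.group(0))


# Hypothetical toy sequence containing one consensus match.
toy = "MAAGSQPTTTTTHKLHGG"
print(find_binding_motif(toy))
```

Because the spacer between P and the first H is variable, the matched region can be 11 to 13 residues long; a consensus match of this kind marks a candidate site only.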

[0319] The 173 amino acid insert of human PAK4, relative to the known 
mammalian PAK proteins, is most related to a C. elegans kinase (GB: Z74029) with 
39.0% amino acid identity. A Smith-Waterman search of the nonredundant protein 
database does not reveal any significant homologies that might suggest a potential function 
for this region. 

[0320] The 169 amino acid spacer of human PAK4 does not reveal any significant 
homologies that might suggest a potential function for this region. 

[0321] The equivalent spacer region in PAK1 binds to the guanine nucleotide 
exchange factor PIX (Manser, E. et al (1998) Molecular Cell, 1, 183-192). Since PAK4 
differs substantially from PAK1 over this region, the spacer domain of PAK4 may differ in 
its guanine nucleotide exchange factor binding specificity, relative to PAK1. 

[0322] The 265 amino acid catalytic domain of human PAK4 is most related to 
human PAK5 (SEQ ID NO:30), PAK1 (GB:U24152), GCK (GB:U07349), SOK-1 
(GB:X99325), and SLK (GB:AB002804), as well as to the C. elegans (GB: Z74029), and 
S. cerevisiae STE20-related kinases (GB:L04655), sharing 95.9%, 51.7%, 41.3%, 39.8%, 
37.4%, 60.2% and 47.9% amino acid identity, respectively. PAK4 contains the potential 
"TPY" regulatory phosphorylation site in its activation loop. This "TPY" motif is 
conserved in other STE20-related kinases, including ZC1, ZC2, ZC3, ZC4, GEK2, KHS2, 
SULU1, SULU3 and PAK5. 

[0323] The 23 amino acid C-tail of human PAK4 contains a sequence that is 
homologous to a G-protein beta subunit binding site (Leeuw, T. et al. Nature, 391, 191- 
195 (1998)). PAK4 has, therefore, the potential to be activated by both Cdc42- as well as 
G-protein-dependent pathways. 

Mammalian PAK5 

[0324] The 2,806 bp human PAK5 nucleotide sequence of the complete cDNA 
encodes a polypeptide of 591 amino acids (SEQ ID NO: 103) with a predicted molecular 
mass of 64,071 Daltons. Analysis of the deduced amino acid sequence predicts PAK5 to 
be an intracellular STE20-subfamily kinase, lacking both a signal sequence and 
transmembrane domain. The full-length PAK5 protein contains a 52 amino acid 
N-terminus predicted to contain a p21 (small G-protein) binding domain (PBD or CRIB), a 
121 amino acid insert relative to the known mammalian PAK proteins, a 134 amino acid spacer 
region, a 265 amino acid catalytic domain with all the motifs characteristic of a 
serine/threonine kinase and a 19 amino acid C-terminal tail. 

[0325] PAK5 is most closely related to human PAK4 (SEQ ID NO:29), 
Drosophila melanogaster PAK (also known as "mushroom bodies tiny") (AJ011578), 
C45B11.1b from C. elegans (Z74029), and human PAK3 (Q13177) sharing 48% (327/674 
aa), 50% (330/651 aa), 43% (234/435 aa excluding gap), and 47% (190/405 aa excluding 
gap) amino acid identity, respectively. Recently, a full length version of PAK5 was 
reported (PAK4 AF005046) whose 591 amino acid sequence is identical to PAK5 (SEQ 
ID NO:103). (Abo, et al (1998) EMBO J. 17:6527-6540). 

[0326] The 52 amino acid N-terminal domain of human PAK5 is most related to 
human PAK4 (SEQ ID NO:29), Drosophila melanogaster PAK (AJ011578), C45B11.1b 
from C. elegans (Z74029), and human PAK3 (Q13177), sharing 65%, 57%, 54%, and 53% 
amino acid identity, respectively. 

[0327] The 10 amino acid region at positions 12-22 of human PAK5 (FIG. 18) fits 
the consensus for a small G-protein binding domain (PBD or CRIB) (SXPX4-6HXXH) 
(SEQ ID NO: 150) (Burbelo, P.D., Drechsel, D. and Hall, A. J. Biol. Chem. 270, 
29071-29074 (1995), hereby incorporated by reference herein in its entirety including any figures, 
tables, or drawings). 

[0328] The 121 amino acid insert of human PAK5 shares 43% amino acid identity 
with a similar domain from PAK4 (SEQ ID NO:29), but that is absent from other known 
PAKs. 

[0329] The equivalent spacer region in PAK1 binds to the guanine nucleotide 
exchange factor PIX (Manser, E. et al (1998) Molecular Cell, 1, 183-192 hereby 
incorporated by reference herein in its entirety including any drawings, figures, or tables.). 
Since PAK5 differs substantially from PAK1 over this region, the spacer domain of PAK5 
may differ in its guanine nucleotide exchange factor binding specificity, relative to PAK1. 

[0330] The 134 amino acid collagen-like region of human PAK5 shares 34% 
amino acid identity to pro-α I type collagen from several species and is not present in other 
known PAKs. 

[0331] The 265 amino acid catalytic domain of human PAK5 is most related to 
human PAK4 (SEQ ID NO:29), Drosophila melanogaster PAK (AJ011578), C45B11.1b 
from C. elegans (Z74029), and human PAK3 (Q13177), sharing 78%, 80%, 61%, and 55% 
amino acid identity, respectively. PAK5 also contains the potential "TPY" regulatory 
phosphorylation site in its activation loop. This "TPY" motif is conserved in other 
STE20-related kinases, including ZC1, ZC2, ZC3, ZC4, GEK2, KHS2, SULU1, SULU3 
and PAK4. 

[0332] The 19 amino acid C-tail shares 80% amino acid identity to a PAK-like 
homologue identified from genomic DNA (AL031652). Furthermore, this C-terminal 
region of human PAK5 contains a sequence that is homologous to a G-protein beta subunit 
binding site (Leeuw, T. et al. Nature, 391, 191-195 (1998) hereby incorporated by 
reference herein in its entirety including any figures, tables, or drawings). PAK5 has, 
therefore, the potential to be activated by both Cdc42- and G-protein-dependent 
pathways. 

V. Antibodies, Hybridomas, Methods of Use and Kits for Detection of STE20-Related 
Kinases 

[0333] The present invention relates to an antibody having binding affinity to a 
kinase of the invention. The polypeptide may have the amino acid sequence set forth in 
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID 
NO: 15, SEQ ID NO: 18, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:29, 
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, or 
SEQ ID NO: 107, or a functional derivative thereof, or at least 9 contiguous amino acids 
thereof (preferably, at least 20, 30, 35, or 40 or more contiguous amino acids thereof). 

[0334] The present invention also relates to an antibody having specific binding 
affinity to a kinase of the invention. Such an antibody may be isolated by comparing its 
binding affinity to a kinase of the invention with its binding affinity to other polypeptides. 
Those which bind selectively to a kinase of the invention would be chosen for use in 
methods requiring a distinction between a kinase of the invention and other polypeptides. 
Such methods could include, but should not be limited to, the analysis of altered kinase 
expression in tissue containing other polypeptides. 

[0335] The STE20-Related kinases of the present invention can be used in a 
variety of procedures and methods, such as for the generation of antibodies, for use in 
identifying pharmaceutical compositions, and for studying DNA/protein interaction. 

[0336] The kinases of the present invention can be used to produce antibodies or 
hybridomas. One skilled in the art will recognize that if an antibody is desired, such a 
peptide could be generated as described herein and used as an immunogen. The antibodies 
of the present invention include monoclonal and polyclonal antibodies, as well as fragments 
of these antibodies, and humanized forms. Humanized forms of the antibodies of the 
present invention may be generated using one of the procedures known in the art such as 
chimerization or CDR grafting. 

[0337] The present invention also relates to a hybridoma which produces the 
above-described monoclonal antibody, or binding fragment thereof. A hybridoma is an 
immortalized cell line which is capable of secreting a specific monoclonal antibody. 

[0338] In general, techniques for preparing monoclonal antibodies and hybridomas 
are well known in the art (Campbell, "Monoclonal Antibody Technology: Laboratory 
Techniques in Biochemistry and Molecular Biology," Elsevier Science Publishers, 
Amsterdam, The Netherlands, 1984; St. Groth et al., J. Immunol. Methods 35:1-21, 1980). 
Any animal (mouse, rabbit, and the like) which is known to produce antibodies can be 
immunized with the selected polypeptide. Methods for immunization are well known in 
the art. Such methods include subcutaneous or intraperitoneal injection of the polypeptide. 
One skilled in the art will recognize that the amount of polypeptide used for immunization 
will vary based on the animal which is immunized, the antigenicity of the polypeptide and 
the site of injection. 

[0339] The polypeptide may be modified or administered in an adjuvant in order to 
increase the peptide antigenicity. Methods of increasing the antigenicity of a polypeptide 
are well known in the art. Such procedures include coupling the antigen with a 
heterologous protein (such as globulin or β-galactosidase) or through the inclusion of an 
adjuvant during immunization. 

[0340] For monoclonal antibodies, spleen cells from the immunized animals are 
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and allowed to 
become monoclonal antibody producing hybridoma cells. Any one of a number of 
methods well known in the art can be used to identify the hybridoma cell which produces 
an antibody with the desired characteristics. These include screening the hybridomas with 
an ELISA assay, western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 
175:109-124, 1988). Hybridomas secreting the desired antibodies are cloned and the class 
and subclass are determined using procedures known in the art (Campbell, "Monoclonal 
Antibody Technology: Laboratory Techniques in Biochemistry and Molecular Biology", 
supra, 1984). 

[0341] For polyclonal antibodies, antibody-containing antiserum is isolated from the 
immunized animal and is screened for the presence of antibodies with the desired 
specificity using one of the above-described procedures. The above-described antibodies 
may be detectably labeled. Antibodies can be detectably labeled through the use of 
radioisotopes, affinity labels (such as biotin, avidin, and the like), enzymatic labels (such 
as horseradish peroxidase, alkaline phosphatase, and the like), fluorescent labels (such as 
FITC or rhodamine, and the like), paramagnetic atoms, and the like. Procedures for 
accomplishing such labeling are well known in the art; for example, see Sternberger et al., 
J. Histochem. Cytochem. 18:315, 1970; Bayer et al., Meth. Enzym. 62:308-, 1979; Engval 
et al., Immunol. 109:129-, 1972; Goding, J. Immunol. Meth. 13:215-, 1976. The labeled 
antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to 
identify cells or tissues which express a specific peptide. 

[0342] The above-described antibodies may also be immobilized on a solid 
support. Examples of such solid supports include plastics such as polycarbonate, complex 
carbohydrates such as agarose and sepharose, and acrylic resins such as polyacrylamide 
and latex beads. Techniques for coupling antibodies to such solid supports are well known 
in the art (Weir et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell 
Scientific Publications, Oxford, England, Chapter 10, 1986; Jacoby et al., Meth. Enzym. 
34, Academic Press, N.Y., 1974). The immobilized antibodies of the present invention 
can be used for in vitro, in vivo, and in situ assays as well as in immunochromatography. 

[0343] Furthermore, one skilled in the art can readily adapt currently available 
procedures, as well as the techniques, methods and kits disclosed herein with regard to 
antibodies, to generate peptides capable of binding to a specific peptide sequence in order 
to generate rationally designed antipeptide peptides (Hurby et al., "Application of 
Synthetic Peptides: Antisense Peptides", In Synthetic Peptides, A User's Guide, W.H. 
Freeman, NY, pp. 289-307, 1992; Kaspczak et al., Biochemistry 28:9230-9238, 1989). 

[0344] Anti-peptide peptides can be generated by replacing the basic amino acid 
residues found in the peptide sequences of the kinases of the invention with acidic 
residues, while maintaining hydrophobic and uncharged polar groups. For example, 
lysine, arginine, and/or histidine residues are replaced with aspartic acid or glutamic acid 
and glutamic acid residues are replaced by lysine, arginine or histidine. 
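
The substitution rule in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a design procedure from this disclosure: the function name, the default choice of aspartic acid (D) as the acidic replacement and lysine (K) as the basic replacement, and the example sequence are all hypothetical. Because the paragraph mentions only glutamic acid among the acidic residues to be replaced, aspartic acid in the input is left unchanged here.

```python
BASIC = {"K", "R", "H"}  # lysine, arginine, histidine (one-letter codes)


def antipeptide(sequence, acidic_sub="D", basic_sub="K"):
    """Invert the charge pattern of a peptide given in one-letter code.

    Basic residues become `acidic_sub` (D or E), glutamic acid becomes
    `basic_sub` (K, R or H), and every other residue, including
    hydrophobic and uncharged polar residues, is kept as is.
    """
    if acidic_sub not in {"D", "E"}:
        raise ValueError("acidic_sub must be 'D' or 'E'")
    if basic_sub not in BASIC:
        raise ValueError("basic_sub must be 'K', 'R' or 'H'")
    out = []
    for residue in sequence.upper():
        if residue in BASIC:
            out.append(acidic_sub)
        elif residue == "E":
            out.append(basic_sub)
        else:
            out.append(residue)
    return "".join(out)


# Hypothetical input peptide, chosen only to exercise each rule.
print(antipeptide("AKRHEDG"))
```

The two keyword arguments make the "or" choices in the rule explicit; a caller could equally pass `acidic_sub="E"` or `basic_sub="R"`.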

[0345] The present invention also encompasses a method of detecting a STE20- 
related kinase polypeptide in a sample, comprising: (a) contacting the sample with an 
above-described antibody, under conditions such that immunocomplexes form, and (b) 
detecting the presence of said antibody bound to the polypeptide. In detail, the methods 
comprise incubating a test sample with one or more of the antibodies of the present 
invention and assaying whether the antibody binds to the test sample. Altered levels of a 
kinase of the invention in a sample as compared to normal levels may indicate disease. 

[0346] Conditions for incubating an antibody with a test sample vary. Incubation 
conditions depend on the format employed in the assay, the detection methods employed, 
and the type and nature of the antibody used in the assay. One skilled in the art will 
recognize that any one of the commonly available immunological assay formats (such as 
radioimmunoassays, enzyme-linked immunosorbent assays, diffusion-based Ouchterlony, 
or rocket immunofluorescent assays) can readily be adapted to employ the antibodies of 
the present invention. Examples of such assays can be found in Chard ("An Introduction 
to Radioimmunoassay and Related Techniques" Elsevier Science Publishers, Amsterdam, 
The Netherlands, 1986), Bullock et al. ("Techniques in Immunocytochemistry," Academic 
Press, Orlando, FL Vol. 1, 1982; Vol. 2, 1983; Vol. 3, 1985), and Tijssen ("Practice and 
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular 
Biology," Elsevier Science Publishers, Amsterdam, The Netherlands, 1985). 

[0347] The immunological assay test samples of the present invention include 
cells, protein or membrane extracts of cells, or biological fluids such as blood, serum, 
plasma, or urine. The test samples used in the above-described method will vary based on 
the assay format, nature of the detection method and the tissues, cells or extracts used as 
the sample to be assayed. Methods for preparing protein extracts or membrane extracts of 
cells are well known in the art and can readily be adapted in order to obtain a sample 
which is testable with the system utilized. 

[0348] A kit contains all the necessary reagents to carry out the previously 
described methods of detection. The kit may comprise: (i) a first container means 
containing an above-described antibody, and (ii) a second container means containing a 
conjugate comprising a binding partner of the antibody and a label. In another preferred 
embodiment, the kit further comprises one or more other containers comprising one or 
more of the following: wash reagents and reagents capable of detecting the presence of 
bound antibodies. 

[0349] Examples of detection reagents include, but are not limited to, labeled 
secondary antibodies, or in the alternative, if the primary antibody is labeled, the 
chromophoric, enzymatic, or antibody binding reagents which are capable of reacting with 
the labeled antibody. The compartmentalized kit may be as described above for nucleic 
acid probe kits. One skilled in the art will readily recognize that the antibodies described 
in the present invention can readily be incorporated into one of the established kit formats 
which are well known in the art. 

VI. Isolation of Compounds Which Interact With STE20-Related Kinases 

[0350] The present invention also relates to a method of detecting a compound 
capable of binding to a STE20-related kinase of the invention comprising incubating the 
compound with a kinase of the invention and detecting the presence of the compound 
bound to the kinase. The compound may be present within a complex mixture, for 
example, serum, body fluid, or cell extracts. 

[0351] The present invention also relates to a method of detecting an agonist or 
antagonist of kinase activity or kinase binding partner activity comprising incubating cells 
that produce a kinase of the invention in the presence of a compound and detecting 
changes in the level of kinase activity or kinase binding partner activity. The compounds 
thus identified would produce a change in activity indicative of the presence of the 
compound. The compound may be present within a complex mixture, for example, serum, 
body fluid, or cell extracts. Once the compound is identified it can be isolated using 
techniques well known in the art. 

[0352] The present invention also encompasses a method of agonizing 
(stimulating) or antagonizing kinase associated activity in a mammal comprising 
administering to said mammal an agonist or antagonist to a kinase of the invention in an 
amount sufficient to effect said agonism or antagonism. A method of treating diseases in a 
mammal with an agonist or antagonist of STE20-related kinase activity comprising 
administering the agonist or antagonist to a mammal in an amount sufficient to agonize or 
antagonize STE20-related kinase associated functions is also encompassed in the present 
application. 

[0353] In an effort to discover novel treatments for diseases, biomedical 
researchers and chemists have designed, synthesized, and tested molecules that inhibit the 
function of protein kinases. Some small organic molecules form a class of compounds that 
modulate the function of protein kinases. Examples of molecules that have been reported 
to inhibit the function of protein kinases include, but are not limited to, bis monocyclic, 
bicyclic or heterocyclic aryl compounds (PCT WO 92/20642, published November 26, 
1992 by Maguire et al.), vinylene-azaindole derivatives (PCT WO 94/14808, published 
July 7, 1994 by Ballinari et al.), 1-cyclopropyl-4-pyridyl-quinolones (U.S. Patent No. 
5,330,992), styryl compounds (U.S. Patent No. 5,217,999), styryl-substituted pyridyl 
compounds (U.S. Patent No. 5,302,606), certain quinazoline derivatives (EP Application 
No. 0 566 266 A1), selenoindoles and selenides (PCT WO 94/03427, published February 
17, 1994 by Denny et al.), tricyclic polyhydroxylic compounds (PCT WO 92/21660, 
published December 10, 1992 by Dow), and benzylphosphonic acid compounds (PCT WO 
91/15495, published October 17, 1991 by Dow et al.). 

[0354] Compounds that can traverse cell membranes and are resistant to acid 
hydrolysis are potentially advantageous as therapeutics as they can become highly 
bioavailable after being administered orally to patients. However, many of these protein 
kinase inhibitors only weakly inhibit the function of protein kinases. In addition, many 
inhibit a variety of protein kinases and will cause multiple side-effects as therapeutics for 
diseases. 

[0355] Some indolinone compounds, however, form classes of acid resistant and 
membrane permeable organic molecules. WO 96/22976 (published August 1, 1996 by 
Ballinari et al.) describes hydrosoluble indolinone compounds that harbor tetralin, 
naphthalene, quinoline, and indole substituents fused to the oxindole ring. These bicyclic 
substituents are in turn substituted with polar moieties including hydroxylated alkyl, 
phosphate, and ether moieties. U.S. Patent Application Serial Nos. 08/702,232, filed 
August 23, 1996, entitled "Indolinone Combinatorial Libraries and Related Products and 
Methods for the Treatment of Disease" by Tang et al., and 08/485,323, filed June 7, 1995, 
entitled "Benzylidene-Z-Indoline Compounds for the Treatment of Disease" by Tang et al., 
and International Patent Publication WO 96/22976, published August 1, 1996 by Ballinari 
et al., all of which are incorporated herein by reference in their entirety, including any 
drawings, describe indolinone chemical libraries of indolinone compounds harboring other 
bicyclic moieties as well as monocyclic moieties fused to the oxindole ring. Applications 
08/702,232, filed August 23, 1996, entitled "Indolinone Combinatorial Libraries and 
Related Products and Methods for the Treatment of Disease" by Tang et al., 08/485,323, 
filed June 7, 1995, entitled "Benzylidene-Z-Indoline Compounds for the Treatment of 
Disease" by Tang et al., and WO 96/22976, published August 1, 1996 by Ballinari et al. 
teach methods of indolinone synthesis, methods of testing the biological activity of 
indolinone compounds in cells, and inhibition patterns of indolinone derivatives. 

[0356] Other examples of substances capable of modulating kinase activity 
include, but are not limited to, tyrphostins, quinazolines, quinoxalines, and quinolines. 
The quinazolines, tyrphostins, quinolines, and quinoxalines referred to above include well 
known compounds such as those described in the literature. For example, representative 
publications describing quinazolines include Barker et al., EPO Publication No. 0 520 722 
A1; Jones et al., U.S. Patent No. 4,447,608; Kabbe et al., U.S. Patent No. 4,757,072; Kaul 
and Vougioukas, U.S. Patent No. 5,316,553; Kreighbaum and Comer, U.S. Patent No. 
4,343,940; Pegg and Wardleworth, EPO Publication No. 0 562 734 A1; Barker et al., 
Proc. of Am. Assoc. for Cancer Research 32:327 (1991); Bertino, J.R., Cancer Research 
3:293-304 (1979); Bertino, J.R., Cancer Research 9(2 part 1):293-304 (1979); Curtin et al., 
Br. J. Cancer 53:361-368 (1986); Fernandes et al., Cancer Research 43:1117-1123 (1983); 
Ferris et al., J. Org. Chem. 44(2):173-178; Fry et al., Science 265:1093-1095 (1994); 
Jackman et al., Cancer Research 51:5579-5586 (1981); Jones et al., J. Med. Chem. 
29(6):1114-1118; Lee and Skibo, Biochemistry 26(23):7355-7362 (1987); Lemus et al., J. 
Org. Chem. 54:3511-3518 (1989); Ley and Seng, Synthesis 1975:415-522 (1975); 
Maxwell et al., Magnetic Resonance in Medicine 17:189-196 (1991); Mini et al., Cancer 
Research 45:325-330 (1985); Phillips and Castle, J. Heterocyclic Chem. 17(19):1489-1596 
(1980); Reece et al., Cancer Research 47(11):2996-2999 (1977); Sculier et al., Cancer 
Immunol. and Immunother. 23:A65 (1986); Sikora et al., Cancer Letters 23:289-295 
(1984); Sikora et al., Analytical Biochem. 172:344-355 (1988); all of which are 
incorporated herein by reference in their entirety, including any drawings. 

[0357] Quinoxaline is described in Kaul and Vougioukas, U.S. Patent No. 
5,316,553, incorporated herein by reference in its entirety, including any drawings. 

[0358] Quinolines are described in Dolle et al., J. Med. Chem. 37:2627-2629 
(1994); MaGuire, J. Med. Chem. 37:2129-2131 (1994); Burke et al., J. Med. Chem. 
36:425-432 (1993); and Burke et al. BioOrganic Med. Chem. Letters 2:1771-1774 (1992), 
all of which are incorporated by reference in their entirety, including any drawings. 

[0359] Tyrphostins are described in Allen et al., Clin. Exp. Immunol. 91:141-156 
(1993); Anafi et al., Blood 82:12:3524-3529 (1993); Baker et al., J. Cell Sci. 102:543-555 
(1992); Bilder et al., Amer. Physiol. Soc. pp. 6363-6143:C721-C730 (1991); Brunton et 
al., Proceedings of Amer. Assoc. Cancer Rsch. 33:558 (1992); Bryckaert et al., 
Experimental Cell Research 199:255-261 (1992); Dong et al., J. Leukocyte Biology 53:53-60 
(1993); Dong et al., J. Immunol. 151(5):2717-2724 (1993); Gazit et al., J. Med. Chem. 
32:2344-2352 (1989); Gazit et al., J. Med. Chem. 36:3556-3564 (1993); Kaur et al., 
Anti-Cancer Drugs 5:213-222 (1994); Kaur et al., King et al., Biochem. J. 275:413-418 
(1991); Kuo et al., Cancer Letters 74:197-202 (1993); Levitzki, A., The FASEB J. 6:3275-3282 
(1992); Lyall et al., J. Biol. Chem. 264:14503-14509 (1989); Peterson et al., The 
Prostate 22:335-345 (1993); Pillemer et al., Int. J. Cancer 50:80-85 (1992); Posner et al., 
Molecular Pharmacology 45:673-683 (1993); Rendu et al., Biol. Pharmacology 44(5):881-888 
(1992); Sauro and Thomas, Life Sciences 53:371-376 (1993); Sauro and Thomas, J. 
Pharm. and Experimental Therapeutics 267(3): 119-1 125 (1993); Wolbring et al., J. Biol. 
Chem. 269(36):22470-22472 (1994); and Yoneda et al., Cancer Research 51:4430-4435 
(1991); all of which are incorporated herein by reference in their entirety, including any 
drawings. 

[0360] Other compounds that could be used as modulators include oxindolinones 
such as those described in U.S. patent application Serial No. 08/702,232 filed August 23, 
1996, incorporated herein by reference in its entirety, including any drawings. 

VII. Biological Significance, Applications and Clinical 
Relevance of Novel STE20-Related Kinases 
Human STLK2, STLK3, STLK4, STLK5, STLK6, and STLK7 
[0361] STLK2, STLK4, STLK5, STLK6 and STLK7 belong to an expanding 
family of intracellular STKs that have varying degrees of sequence homology to SOK-1, a 
kinase implicated in the response to oxidant stress (Pombo, C.M. et al., EMBO J. 15, 4537-4546, 
1996). Our data show that STLK2 is expressed highly in hematopoietic cells. Therefore, 
STLK2 may participate in the oxidative response pathway during inflammation. In 
addition, STLK2 could also be a possible component in the signaling pathways leading to 
T cell activation. High levels of STLK2 in several tumor cell lines could also imply that 
STLK2 might be involved in tumorigenesis. 

[0362] STLK2 is most closely related to two human STE20-subfamily kinases: 
MST3 and SOK-1. MST3 is a 52,000 dalton cytoplasmic kinase that is ubiquitously 
expressed with its highest levels of expression found in heart, skeletal muscle and 
pancreas. The serine/threonine kinase activity of MST3 is activated by phosphorylation. 
Unlike SOK-1, MST3 prefers Mn2+ over Mg2+ and can use both GTP and ATP as 
phosphate donors. MST3 may undergo dimerization. No agonists have yet been identified 
that activate MST3. The downstream signaling mechanism of this kinase is unknown 
(Schinkmann, K and Blenis, J. (1997) J. Biol. Chem. 272, 28695-28703). 

[0363] SOK-1 is a 50,000 dalton cytoplasmic kinase expressed predominantly in 
testis, large intestine, brain and stomach and to a lesser extent in heart and lung. SOK-1 is 
also expressed in the germinal center B-cell line (RAMOS) and in a mature B cell line (HS 
Sultan). The serine/threonine kinase activity of SOK-1 is activated by phosphorylation. 
The C-terminus of SOK-1 has been shown to be inhibitory to the catalytic activity of this 
kinase. The only agonists known to activate SOK-1 are oxidant agents, like H2O2 and 
menadione, a quinone that is a potent intracellular generator of reactive oxygen species 
(Pombo, C.M. et al., EMBO J. 15, 4537-4546). SOK-1 is also activated by chemical 
anoxia through the generation of reactive oxygen species and release of calcium into the 
cytoplasm from intracellular stores. SOK-1, therefore, may play an important role in 
ischemia, the cause of myocardial infarction, stroke and acute renal failure (Pombo, C.M. 
et al., J. Biol. Chem. 272, 29372-29379 (1997)). The activity of SOK-1 in the response to 
oxidant stress is inversely correlated with the activity of the stress-activated protein 
kinases (SAPKs): elevated SOK-1 activity correlates with absent SAPK activity and vice 
versa. SOK-1 does not activate any of the four MAP kinase pathways, SAPKs, p38, ERK-1 
or MEK-5/ERK-5 (Pombo, C.M. et al., EMBO J. 15, 4537-4546). The downstream 
signaling mechanism of this kinase remains unknown. 

[0364] STLK2 is expressed in a wide variety of immune cell types and tissues 
including thymus, dendrocytes, mast cells, monocytes, B cells (primary, Jurkat, RPMI, 
SR), T cells (CD8/CD4+, TH1, TH2, CEM, MOLT4) and megakaryocytes (K562), 
whereas STLK3 is restricted to thymus and STLK4 is predominantly expressed in thymus, 
T cells (CD4/CD8+, TH1, CEM) and B cells (Jurkat, RPMI). Consequently, these STKs 
might participate in the oxidative response pathway during inflammation, reperfusion 
injury (stroke, surgery, shock), TNFα-mediated signaling, insulin desensitization, 
atherogenesis, vascular injury, T or B cell costimulation, or alternatively, participate in 
other MAPK-related signal transduction processes. 

[0365] STLK5 is more distantly related to this STE20-subfamily, which includes SOK-1, 
STLK2, STLK3 and STLK4. STLK5 may therefore mediate a signaling pathway that 
is distinct from the oxidative stress response pathway. 

[0366] The high degree of sequence homology in the C-termini of SOK-1, STLK2, 
STLK3, STLK4, STLK5, and STLK6 raises the possibility that these novel STKs, like 
SOK-1, may be subject to autoinhibition through a conserved C-terminal motif. 

Human ZC1, ZC2, ZC3 and ZC4 

[0367] ZC1 is a good candidate for any disease in which tyrosine kinase, cytokine, 
or heterotrimeric G-protein coupled receptors have been implicated. The mouse 
homologue binds to NCK, and is recruited to activated PDGF (Su et al., EMBO 16:1279-1290, 
1997). The Drosophila homolog has been shown to bind to TRAF2, implicating it 
in TNF-α signaling (Liu et al., Curr. Biol. 9:101-104, 1999). While ZC1 does not 
contain the exact NCK- and TRAF2-binding domains, it is likely to bind to related 
proteins. 

[0368] Of the ZC subfamily of STE20-related protein kinases, ZC1 has very broad 
over-expression in many tumor types, suggesting that it may be involved in cellular 
growth, transformation, or tumor progression. A truncated form of ZC1 containing only 
the C-terminal putative MEKK1-binding domain was found to reduce the number of foci 
generated by H-Ras-V12 in Rat Intestinal Epithelial cells (RIE-1). These data indicate that 
ZC1 may play a role in the ability of these cells to overcome contact inhibition and 
anchorage-dependent growth. 

[0369] The ZC1 homolog, Misshapen (msn), in Drosophila melanogaster was 
cloned as a result of complementing a mutation in a developmental pathway required for 
dorsal closure, a process involving changes in cell shape and position in the embryo 
(Treisman et al., Gene 186:119-125, 1997). A D. melanogaster homolog of the 
JNK1/JNK2 kinases from mammals was shown to function downstream of msn in the 
dorsal-closure signaling pathway (Su et al. Genes Dev. 12:2371-2380, 1998). 

[0370] While ZC1 could be involved in multiple aspects of tumorigenesis, by 
analogy with Drosophila, the role of misshapen in dorsal closure suggests a critical role in 
the regulation of the cytoskeleton for the processes of cell attachment, cell movement and 
perhaps migration. 

[0371] The association of the ZC1 family members msn and NIK with TRAF2 may 
indicate a role for this kinase in cell survival and/or in apoptosis. The ZC1 family contains 
a highly conserved domain that in the mouse homolog, NIK, has been shown to bind to 
MEKK1 (Mitogen-activated/Extracellular-regulated Kinase Kinase 1) (Su et al., EMBO 
16(6):1279-90, 1997). MEKK1 is involved in cell survival and/or apoptosis in several 
systems (Schlesinger et al., Front. Biosci. 3:D1181-6, 1998). Depending on the context, 
MEKK1 appears to be upstream of either the ERK1/MAPK or the JNK/SAPK pathway 
(Schlesinger et al., Front. Biosci. 3:D1181-6, 1998). Three homologues of ZC1, murine 
NIK (NCK-interacting kinase) (Su et al., EMBO 16:1279-90, 1997), Drosophila msn (Liu et 
al., Curr. Biol. 9:101-104, 1999) and human HGK (HPK/GCK-like kinase) (Yao et al., J. 
Biol. Chem. 274:2118-25, 1999), have all been shown to activate the JNK pathway when 
over-expressed in 293T cells. 

[0372] ZC1 shares a high degree of homology with these other family members in 
both the kinase domain and the MEKK1-binding domains, yet it differs in the intervening 
region, which contains several putative binding domains for upstream signaling adapter 
molecules (e.g. NCK, TRAF2). Unlike the other family members, ZC1 does not appear to 
activate the JNK pathway in 293T cells, as measured by its ability to induce expression of either 
a JUN- or ATF2-driven luciferase gene. Upon co-transfection into these cells with HA-tagged 
JNK, modest activation of JNK was detected. ZC1 also modestly activated 
co-transfected ERK1. Both the ERK and the JNK activation were very slight compared with 
the positive controls in the assay (activated forms of MEK1 and MEKK1, respectively). In 
both cases, activation required the full-length kinase. While the kinase domain alone is up 
to 5x more active in autophosphorylation and in phosphorylation of MBP, it does not lead 
to activation of these potential downstream kinases. Based on the strong sequence 
homology of ZC1 with the other family members, it is very likely that ZC1 will be 
important for either JNK or ERK activation once the proper context is found. 

[0373] ZC1 profoundly inhibits ERK1 kinase expression in co-transfection assays. 
This effect is dependent on ZC1 kinase activity, occurring with the wild-type and the 
kinase domain alone, but not with the kinase-dead mutant even though all three forms of 
ZC1 are expressed at similar levels. This may suggest a role for this kinase in 
transcriptional or post-transcriptional regulation. 

[0374] ZC1 may be an important component in the signaling pathways mediated 
by the co-stimulatory receptor CD28 in T cells and/or by the pro-inflammatory cytokine 
TNFα, since co-transfection of the wild-type ZC1 activated the RE/AP-luciferase and 
NFκB-luciferase reporter genes. While our data showed that ZC1 strongly activates NFκB 
in T-cells, no activation of NFκB-driven luciferase was detectable in NIH 3T3 cells. A 
recent paper (J. Biol. Chem. 274:2118-25, 1999) has shown that a human ZC1 splicing 
isoform, HGK, is involved in the TNFα-signaling pathways. 

[0375] Given the importance of T cell activation in autoimmunity and 
transplantation, as well as the key role that TNFα plays in inflammatory diseases, it is 
possible that ZC1 could be a therapeutic target for immunological diseases which include 
but are not limited to: rheumatoid arthritis, chronic inflammatory bowel diseases (e.g., 
Crohn's disease), chronic inflammatory pelvic disease, multiple sclerosis, asthma, 
osteoarthritis, psoriasis, atherosclerosis, rhinitis, and autoimmunity as well as organ 
transplantation and cardiovascular diseases. 

[0376] ZC1 appears to be the human orthologue of murine NIK and possibly an 
orthologue of a C. elegans STE20-subfamily kinase encoded by the ZC504.4 cosmid. 

[0377] Murine NIK is a 140,000 dalton kinase that is most highly expressed in 
brain and heart. NIK interacts with the SH3 domains of the adaptor molecule Nck through 
its proline-rich regions found in the C-terminal extra-catalytic region. The specific regions 
that mediate this interaction are two PxxP (SEQ ID NO:148) motifs that are nearly 
uniformly conserved between NIK, ZC1, ZC2, ZC3 and the C. elegans STE20 ZC504.4 kinase. In 
addition, NIK binds MEKK1 through its 719 amino acid C-terminal region (Su, Y-C. et al. 
(1997) EMBO J. 16, 1279-1290). MEKK1 is a membrane-associated kinase responsible 
for activating MKK4 (also known as SEK1), which in turn activates SAPK (Yan, M. et al. 
(1994) Nature, 372, 798-800). NIK may function as a kinase that links growth factor 
activated pathways and the stress-response pathway mediated by SAPKs. According to 
this hypothesis, activation of growth factor receptors leads to receptor tyrosine 
phosphorylation, Nck binding to the phosphorylated tyrosines via its SH2 domain, NIK 
redistribution to a membrane compartment via binding to the SH3 domain of Nck, and 
juxtaposition to the membrane-associated MEKK1. The NIK-MEKK1 interaction would, 
in this fashion, turn on the SAPK pathway in response to growth factor stimulation (Su, Y-C. 
et al. (1997) EMBO J. 16, 1279-1290). 

[0378] Given the high homology between ZC1, ZC2, ZC3, and ZC4 STKs and 
NIK, it is conceivable that these kinases may each function to connect growth factor- and 
stress-activated signaling pathways. The heterogeneity that the ZC kinases exhibit within 
their putative SH3-binding domain could provide signaling specificity in terms of the 
nature of the adaptor molecule that they bind. The high level of sequence conservation in 
the C-termini of ZC1, ZC2 and ZC3 strongly suggests that these human kinases, like 
murine NIK, also may bind to MEKK1 and activate SAPKs. The ZC kinases also display 
strong homology at their C-termini to protein domains that bind small GTPase proteins 
such as Rab, Rho and Rac. For example, the C-terminus of ZC1 is 36.2% identical to 
citron, a murine Rho-binding protein, and 23.1% identical to the rab-binding region of GC 
kinase. This suggests that, in addition to adaptor molecules, small GTPase proteins may 
also mediate membrane association and activation of the ZC kinases. The presence of a 
potential coiled-coil region located immediately C-terminal to the catalytic region strongly 
suggests that the ZC kinases may also be subject to regulation via homo- or 
heterodimerization events. 
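
Percent identity figures such as those quoted above are obtained by counting matching positions in a pairwise alignment. The following Python sketch shows one common way to do that count. It is illustrative only: this passage does not state which alignment program or which denominator was used, so the convention below (matches divided by the number of alignment columns that are not gaps in both sequences) is an assumption, and the two short aligned strings are hypothetical and are not ZC1, citron, or GC kinase sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    residue counts as a compared, non-matching column (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments: 4 identical columns out of 7 compared.
identity = percent_identity("MKT-LLA", "MRTGLIA")
print(f"{identity:.1f}% identical")
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns only, and they can give noticeably different percentages for the same alignment.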

[0379] The C. elegans STE20 ZC504.4 kinase is the product of the mig-15 gene. 
The product of this gene has been implicated in several developmental processes such as 
epidermal development, Q neuroblast migrations and muscle arm targeting in the 
developing worm (Zhu, X. and Hedgecock E. (1997) Worm Breeder's Gazette 14, 76). 
The high level of sequence conservation between the ZC kinases and the ZC504.4 C. 
elegans kinase will make C. elegans a valuable model organism to study, through epistatic 
analysis, the signaling properties of the human ZC kinases. 

Human KHS2 

[0380] KHS1 (kinase homologous to SPS1/STE20) is a 100,000 dalton 
cytoplasmic STK that is expressed ubiquitously. KHS1 has been implicated in the 
mechanism of SAPK activation in response to inflammatory cytokines such as TNFα as 
well as to ultraviolet light, which also uses the TNF signaling pathway. TNFα binding 
to its receptors (TNFR1 and TNFR2) results in the sequential association with the receptor 
C-tail of multiple signaling molecules including TNFR1 -associated death domain protein 
(TRADD), Fas-associated death domain protein (FADD or MORT1), TNFR-associated 
factor 2 (TRAF2), and the STK RIP (receptor interacting protein). The TRADD-TRAF2 
interaction is mediated by a conserved region present at the C-terminus of TRAF2, the 
TRAF domain. Activation of the NFk£$ and S APK pathways is mediated by the ring 
finger motif present at the N-terminus of TRAF2 (Curr. Opinion in Cell. Biol. (1997) 
9:247-251). KHS1 is activated by TNFa stimulation in a TRAF2-dependant manner and 
inhibition of KHS1 blocks TNFa-induced SAPK activation but not NFDB activation. 
The mechanism by which TRAF2 activates KHS1 is not known. Cotransfection of 
TRAF2- and KHS1-expressing constructs in 293T cells failed to reveal a direct association 
between these two molecules. KHS1 activates the SAPK pathway by a direct association 
with the constitutively active kinase MEKK1. MEKK1 subsequently activates SEK1, 
which in turn activates SAPK. Neither the MAPK nor the p38 kinase pathways are 
activated by KHS1 (Shi, C-S and Kehrl, J.H. (1997) J. Biol. Chem. 272, 32102-32107). 
In addition to its catalytic domain, downstream signaling of KHS1 requires its conserved 
C-terminus (Diener, K. et al (1997) Proc. Natl. Acad. Sci. 94, 9687-9692). 

[0381] GCK (germinal center kinase) is a constitutively active 97,000 dalton STK 
that is broadly expressed. GCK may participate in B-cell differentiation since its 
expression is localized to the germinal center within lymphoid follicles. GCK activates 
the SAPK pathway in response to TNFα via activation of SEK1. The upstream activators 
of GCK in response to cytokines as well as the immediate downstream target of this kinase 
are unknown. The C-terminus of GCK is sufficient to activate SEK1 (Pombo, C.M. et al 
(1995) Nature, 377, 750-754). 

[0382] The murine orthologue of GCK, rab8ip (rab8-interacting protein), is a 
97,000 dalton protein that fractionates with both the soluble cytoplasmic fraction as well 
as with a salt-sensitive fraction associated with the basolateral membrane of the trans- 
Golgi region in polarized MDCK epithelial cells. The C-terminus of rab8ip binds to rab8, 
a small GTP-binding protein required for vesicular transport from the Golgi apparatus 
(Ren, M. et al (1996) Proc. Natl. Acad. Sci. 93, 5151-5155). In addition to inducing the 
transcriptional activation of cytokines like IL2 via SAPK, GCK may also promote the rab- 
dependent release of secretory proteins in response to TNFα (Buccione, R. et al (1995) 
Mol. Bio. Cell 6, 291). 

[0383] HPK1 (hematopoietic protein kinase) is a constitutively active 90,000 
dalton STK restricted to hematopoietic cells. HPK1 activates the SAPK pathway by 
directly binding to and activating MEKK1 (Hu, M. et al (1996) Genes and Dev. 10:2251- 
2264) as well as the ubiquitously expressed mixed-lineage kinase MLK-3 (Kiefer, F. et al 
(1996) EMBO J. 15:7013-7025). This function of HPK1 requires, in contrast to GCK, 
both its kinase domain as well as its C-terminus. The upstream activators of HPK1 remain 
unknown. HPK1 also plays a key role as a mediator of transforming growth factor-β 
(TGFβ) signaling. HPK1 activates the TGFβ-activated kinase (TAK), which in turn 
stimulates the SAPK pathway by phosphorylating SEK1 (Wang W. et al (1997) J. Biol. 
Chem. 272:22771-22775). 

[0384] KHS2 is expressed in thymus, dendrocytes and monocytes. KHS2 could 
have a complementary function to that of KHS1 as a mediator of SAPK activation in the 
cellular response to inflammatory cytokines. KHS2 could have the potential to interact 
directly with TRAF2 since an STK with the predicted molecular weight of KHS2 
(approximately 101,000 daltons) is found in the TNFR-TRAF2 complex upon TNFα 
stimulation (VanArsdale, T. and Ware, C.F. (1994) J. Immunol. 153, 3043-3050). The 
presence of a putative binding domain for Rab or a Rab-like molecule at the C-terminus of 
KHS2 indicates that KHS2, in addition to having a potential role in the TRAF2-dependent 
TNFα cytokine response, could also mediate signaling events that utilize small GTPase 
proteins. Alternatively, the binding of a small GTPase protein to the C-terminus of KHS2 
may be required for its potential TRAF2-dependent signaling to a downstream kinase such 
as MEKK1. 

Human GEK2, SULU1 and SULU3 

[0385] A recent report (Y-W Qian et al., Science 282:1701-1704, 1998) described 
xPlkk1 as the activator of Plx1 (the Xenopus Polo kinase). In Xenopus oocytes, the STK 
xPlkk1 can phosphorylate and activate Plx1 STK (the mammalian Polo kinase or PLK). A 
dominant-negative (kinase-dead) form of xPlkk1 prevents Plx1 activation and delays 
germinal vesicle breakdown. Yet another unidentified kinase is probably responsible for 
xPlkk1 activation during mitosis. 

[0386] The homology through the entire length of the xPlkk1 protein with GEK2 
suggests that GEK2 might represent the human homologue for xPlkk1. Based on this, 
GEK2 might be upstream of PLK in mammalian cells. In addition, based on the phage 
display screen results using the SULU1 coiled-coil2 domain as bait, SULU1 might also 
interact in vivo with GEK2 and therefore regulate GEK2 (and/or SLK through the coiled- 
coil domain) activation leading to PLK activation and mitosis. 

[0387] If such a cascade of events is required for mitosis in mammalian cells, 
interruption of this signaling cascade at any point might block mitosis and could be 
beneficial for cancer treatment. 

[0388] A recently cloned STE20-subfamily kinase, rat TAO1, is most likely the 
rodent orthologue of human SULU3 (Hutchinson, M. et al. J. Biol. Chem. 273:28625- 
28632, 1998). TAO1 activates MEK3, 4 and 6 in vitro, while in transfected cells it 
associates with and activates only MEK3, resulting in phosphorylation and activation of p38. 
These results implicate TAO1 (SULU3) in the regulation of the p38-containing stress- 
responsive MAP kinase pathway. 

[0389] Human SULU1 is weakly expressed in hematopoietic sources whereas 
SULU3 is found in B-cells and TH1-restricted T cells. These mammalian SULU STKs 
display strong homology to the C. elegans SULU kinase. The role that this kinase plays in 
nematode development is unknown. The strong sequence homology between the catalytic 
domain of mammalian SULU kinases and other STE20-subfamily kinases such as SOK-1 
(human STE20) and KHS2 suggests that the mammalian kinases may participate in the 
stress-response pathway. The potential coiled-coil domains found at the C-terminus of the 
SULU kinases may play a role in the regulation of these kinases. 

[0390] Murine LOK (lymphocyte-oriented kinase) is a constitutively activated 
STK of approximately 130,000 daltons that is predominantly expressed in spleen, thymus 
and bone marrow (Kuramochi, S. et al (1997) J. Biol. Chem. 272: 22679-22684) as well 
as in meiotic testicular and primordial germ cells. The LOK1 gene is located on 
chromosome 11 of the mouse near the wr locus, a region that is associated with 
reproductive and neurological defects (Yanagisawa, M. et al (1996) Mol. Reprod. and 
Dev. 45:411-420). LOK does not activate any of the known MAPK pathways (ERK, JNK 
and p38) nor the NFκB pathway. The upstream signaling elements of LOK as well as the 
extracellular stimuli that utilize this kinase to elicit a biological response are also unknown 
(Kuramochi, S. et al (1997) J. Biol. Chem. 272: 22679-22684). 

[0391] Human GEK2 is highly related to murine LOK, but based on sequence 
divergence in the non-catalytic domain, it appears to be a distinct member of this STE20- 
subfamily. GEK2 may signal through a pathway that remains to be defined. The presence 
of potential coiled-coil regions at the C-terminus of GEK2 could play a key role in 
regulating the functions of this kinase. 

Human PAK4 and PAK5 

[0392] The p21 activated protein kinases (PAK) are a closely related subgroup of 
the STE20 family of serine/threonine kinases. Extensive genetic and biochemical analysis 
of the budding yeast STE20 has shown the critical role this serine/threonine kinase plays at 
the juncture of several important intracellular pathways required to appropriately respond 
to extracellular signals. STE20 links the transcriptional response by mediating the 
activation of the appropriate downstream MAPK pathway as well as coupling changes in 
cellular morphology via its control of the actin cytoskeleton. 

[0393] A hallmark of the PAK subgroup is their small G protein-binding domain 
(PBD) that confers G protein-dependent activation upon this group of kinases. Via the 
PBD, PAKs bind to activated small G proteins resulting in the derepression of the PAK's 
intrinsic kinase activity. 

[0394] Until recently, there were three known PAK kinases: PAK1, a 68 kD 
protein whose expression is restricted to brain, muscle, and spleen; PAK2 
(PAKI, PAK65), a 62 kD protein whose expression is ubiquitous; and PAK3, a 65 kD 
protein whose expression is restricted to the brain. Similar to STE20, the mammalian 
PAKs (1, 2, and 3) have been shown to respond to extracellular signals (growth factors, 
mitogens, cytokines and a variety of cellular stresses) (Bagrodia, et al. (1995). J. Biol. 
Chem. 270: 22731-22737; Zhang, S., et al. (1995). J. Biol. Chem. 270: 23934-23936; 
Frost, J. et al. (1998) J. Biol. Chem. 273: 28191-28198; Galisteo, M. et al. (1996) J. Biol. 
Chem. 271: 20997-21000), and are linked to TCR activation (Yablonski, D., et al (1998) 
EMBO J. 17: 5647-5657), and heterotrimeric G protein-coupled receptors (Knaus, U. et al. 
(1995) Science 269: 221-223). 

[0395] The PAKs were originally identified as effectors for members of the Rho 
family of small G proteins (such as Rac and Cdc42), hence their name, p21-activated 
kinases (PAK) (Manser et al Nature 367:40-46). The recruitment of the PAKs to the 
appropriate intracellular location is critical to their function. Attempts to elucidate the role 
played by PAKs in intracellular signaling and morphological changes are complicated by 
the complex interactions through which they can be recruited by such factors as activated 
small G proteins (Rac, Cdc42), adaptors (Nck) and exchange proteins (PIX, Cool). 

[0396] The adaptor molecule, Nck, is constitutively bound via its SH3 domain to 
the proline-rich motif in the N-terminal portion of PAK1. Binding of the Nck-PAK 
complex to activated growth factor receptors in response to growth factor stimulation 
provides a mechanism to link growth factor-stimulated and stress-response pathways 
(Galisteo, M. et al. (1996) J. Biol. Chem. 271:20997-21000). 

[0397] The PBD found at the N-terminus of PAK1 is responsible for its high- 
affinity interaction with the GTP-bound forms of Cdc42 and Rac (Burbelo, P. et al (1995) 
J. Biol. Chem. 270:29071-29074). The exact mechanism through which the small 
GTPases activate PAKs may involve, in part, association of the kinase with activated 
growth factor receptors through guanine nucleotide exchange factors (GEFs). GEFs 
activate small GTPases by catalyzing the formation of their GTP-bound state, thereby 
promoting their association with, and activation of, PAKs. The known mammalian PAK 
kinases, as well as Drosophila and C. elegans PAKs, all conserve an N-terminal 
extracatalytic motif responsible for a high-affinity interaction with the GEF, PIX. The 
PAK-Cdc42 interaction and subsequent PAK activation occur as a PIX/PAK complex (Manser, E. 
et al (1998) Molecular Cell, 1, 183-192). 

[0398] PAK signaling stimulated by heterotrimeric G proteins is mediated through 
the interaction of a short conserved amino acid region located at the C-terminus of 
PAK1 with the G-protein β-subunit (Leeuw, T. et al. (1998) Nature, 391:191-195). 

[0399] A variety of studies have indicated that the human PAKs are involved in 
mediating the activation of stress-activated protein kinase pathways (JNK and to a lesser 
extent p38). PAKs are also potential mediators in the crosstalk between the pathways 
regulated by the Rho family of small G proteins and the signaling pathways directly 
downstream of Ras leading to the activation of the ERK pathway (Bagrodia, et al. (1995). 
J. Biol. Chem. 270: 22731-22737; Zhang, S., et al (1995). J. Biol. Chem. 270: 23934- 
23936; Brown, J., et al. (1996) Curr Biol. 6:598-605; Frost, J., et al (1996). Mol. Cell. 
Biol. 16: 3707-3713). 

[0400] PAK1 has been implicated in phosphorylating a regulatory site in MEK1 
that is necessary for MEK1's ability to interact with Raf1 (Frost, et al (1997) EMBO J. 
16:6426-6438). PAK3 has been shown to phosphorylate Raf1 on a site that is important 
for Raf1 activity (King, A., et al (1998). Nature 396: 180-183). 

[0401] PAKs play an important role in controlling morphological changes in cell 
shape mediated by the actin cytoskeleton. Such morphological changes are required for 
cellular functions ranging from cell division and proliferation to cell motility and vesicle 
transport. PAK activity has been implicated in the localized assembly (leading edge) and 
disassembly (retracting edge) of focal adhesions necessary for cell motility (Frost J. et al 
(1998) J. Biol. Chem. 273:28191-28198). 

[0402] PAK2 may have a role in the morphological changes induced during 
apoptosis (Rudel, T. (1997) "Membrane and morphological changes in apoptotic cells 
regulated by caspase-mediated activation of PAK2." Science 276:1571-4), and PAK1 may be 
important in preventing apoptosis (Faure S, et al (1997) EMBO J. 16:5550-61). In 
addition to overcoming mitogen- and anchorage-independent growth, tumor cells need to 
escape the programmed cell death that accompanies deregulated cell growth. Thus, 
inhibition of PAKs may be effective in triggering apoptosis in tumors. 

[0403] A direct requirement for PAKs in the transformation of mammalian cells 
has been shown for PAK1 and PAK2. Kinase-dead alleles of PAK1 block ras 
transformation of Rat1 and Schwann cells (Tang, Y., et al. (1997) Mol. Cell. Biol. 17, 
4454-4464). Dominant-negative alleles of PAK2 have been shown to interfere with ras- 
mediated transformation of mammalian cells (Osada, S., (1997) FEBS Lett 404:227-233). 

[0404] Mutations in PAK3 have been implicated in nonsyndromic X-linked mental 
retardation suggesting a role for PAK3 in cognitive function (Allen, K. et al. (1998) Nat. 
Genet. 20: 25-30). PAK1 has been implicated in neurite outgrowth in PC12 cells (Daniels, 
R. et al. (1998) EMBO J. 17: 754-764; Nikolic, M. et al. (1998) Nature 395:194-198). 

[0405] Finally, PAK-like STKs may also play a role in AIDS pathogenesis since 
the myristoylated 27 kD membrane-associated HIV Nef gene product directly interacts with 
and activates these kinases via cdc42 and Rac. The Nef-mediated activation of PAK-like 
STKs correlates with the induction of high viral titers and the development of AIDS in 
infected hosts (Cullen, B. R. (1996) Curr. Biol. 6:1557-1559). 

[0406] Our results show that PAK4 is expressed in thymus, dendrocytes, mast 
cells, monocytes, as well as in T cells (TH2-restricted cells and MOLT4) and the B cell 
line RPMI. PAK5 is found in mast cells and in the T cell line MOLT4. These data suggest 
potential roles for PAK4 and PAK5 in the immune system. 

[0407] PAK4 and PAK5 share with the known PAKs a potential cdc42-binding 
motif at their N-termini. Both PAK4 and PAK5 display sequence homology in their C- 
termini to a motif responsible for an interaction between PAK1 and the β-subunit of 
heterotrimeric G-proteins (amino acid residues 665-676 in PAK4, and amino acid residues 
386-398 in PAK5). Consequently, PAK4, and possibly PAK5, could mediate signaling 
events originating from growth factors as well as from ligands that stimulate G-protein- 
linked receptors. 

[0408] PAK4 conserves a leucine (Leu 44) that, when mutated to a phenylalanine, 
renders the kinase activity of human PAK1 constitutively active, bypassing its cdc42- 
binding requirement for activation (Brown J. et al (1996) Current Biol. 6:598-605). PAK5 
contains an isoleucine at the equivalent position. Therefore, the mechanism by which 
cdc42 potentially activates human PAK1, PAK4, and possibly PAK5, may be very similar. 

[0409] PAK4 and PAK5, however, lack the PIX-binding motif, and consequently 
cdc42-activating GEFs other than PIX (for example Dbl and Cool) must be responsible for 
the activation of these kinases. Alternatively, PAK4 and PAK5 may be activated by 
another GTPase, such as Rac1, which uses the Tiam1 GEF for its activation to the GTP- 
bound state. 

[0410] PAK4 and PAK5 also lack the PxxP (SEQ ID NO: 148) motif responsible 
for the Nck-PAK1 association. Between the PBD or cdc42-binding N-terminal motifs and 
the putative GEF-binding regions, PAK4 and PAK5 have long insertions (185 and 123 
amino acids for PAK4 and PAK5, respectively) relative to PAK1. This region probably 
confers different binding characteristics to adaptor molecules and/or GEFs from those 
exhibited by known mammalian PAKs. 

[0411] PAKs have been shown to be upstream in pathways leading to activation of 
both the JNK (Bagrodia, S., et al. (1995) J. Biol. Chem. 270: 22731-22737) and ERK 
kinase pathways (Brown, J., et al. (1996). Curr Biol. 6:598-605). PAK1 was shown to 
synergize with ras in activation of the ERK pathway through phosphorylation of MEK1 
(Frost, J. et al (1997). EMBO J. 16:6426-6438). Our data show that MEK1 serves as an 
in vitro substrate for PAK4, suggesting a potential role for PAK4 in the activation of the 
ERK pathway and mitogenesis. 

[0412] PAK5 may also have a mitogenic role, and be linked to cancer, based on its 
expression profile (elevated RNA and protein levels in a wide variety of tumor cell lines), 
its interaction with cdc42 via its PBD, and the ability of a kinase-dead allele (Lys350, 351 
Ala) to block ras transformation of NIH3T3 cells. Thus, a screen for small molecule 
inhibitors of PAK5 kinase activity may yield compounds with therapeutic potential for 
intervention in cancer derived from a wide variety of tissue types. 

[0413] PAK4 and PAK5 may also play a role in HIV pathogenesis as potential 
mediators of Nef signaling, since none of the known PAKs correspond to the PAK-like 
kinase shown to interact with, and be activated by, the HIV Nef protein (Lu, X. et al. 
(1996) Current Biology 6:1677-1684). 

[0414] The 3' untranslated region of PAK4 contains a CA repeat that is prone to 
undergo expansion. CA dinucleotide repeat instability has been associated with disease 
(Toren, M.Z. et al (1998) Am. J. Hematol. 57: 148-152), and expansion of such a repeat in 
the 3' untranslated region of PAK4 could implicate this kinase in as yet unknown 
pathologies. 

Clinical applications 

Human STLK2, STLK3, STLK4, STLK5, STLK6, and STLK7 

[0415] STLK3, STLK5, STLK6 and STLK7, as well as other homologues of the 
STLK subfamily of STE20 protein kinases such as STLK4, may play an important role as 
mediators of the immune response. Thus, they are targets for the development of specific 
small molecule inhibitors to treat immunological diseases, including, but not limited to, 
rheumatoid arthritis, chronic inflammatory bowel diseases (e.g. Crohn's disease), chronic 
inflammatory pelvic disease, multiple sclerosis, asthma, osteoarthritis, psoriasis, 
atherosclerosis, rhinitis and autoimmunity, as well as in organ transplantation. Other 
diseases include cardiovascular diseases. 

[0416] The human STLKs may also play an important role in cell growth 
regulation. Thus, they are targets for developing small molecule kinase inhibitors for the 
treatment of cancer and metastases. STLK5 maps to a chromosomal region frequently 
amplified in a variety of tumors including those from non-small cell lung cancer, breast 
cancer and peripheral nerve tumors. This suggests that STLK5 could play a role in the 
development, maintenance, or progression of human tumors. 

[0417] The potential role of human STLKs 2, 3, and 4 in mediating oxidative stress 
strongly suggests that drugs targeting these kinases could prove useful in the treatment of 
myocardial infarction, arrhythmia and other cardiomyopathies, stroke, renal failure, and 
oxidative stress-related neurodegenerative disorders such as amyotrophic lateral sclerosis, 
Parkinson's disease and Leigh syndrome (a necrotizing mitochondrial encephalopathy). 

Human ZC1, ZC2, ZC3, and ZC4 

[0418] ZC1 may be a component of the CD28-signaling pathway and therefore 
important in T cell activation. As such, ZC1 and other ZC subfamily kinases are 
targets for the development of specific small molecule inhibitors to treat immunological 
diseases, including, but not limited to, rheumatoid arthritis, chronic inflammatory bowel 
diseases (e.g. Crohn's disease), chronic inflammatory pelvic disease, multiple sclerosis, 
asthma, osteoarthritis, psoriasis, atherosclerosis, rhinitis and autoimmunity, as well as 
organ transplantation. Other diseases include cardiovascular diseases. 

[0419] ZC1 and ZC2 are also implicated in cell growth regulation. Thus, ZC 
subfamily kinases are targets for developing small molecule inhibitors for the treatment of 
cancer and metastases. ZC2 maps to a chromosomal region frequently amplified in a 
variety of tumors including those from non-small cell lung cancer, small cell lung cancer, 
and cervical cancer. This suggests that ZC2 could play a role in the development, 
maintenance, or progression of human tumors. 

[0420] The role of human ZC1, ZC2, ZC3, and ZC4 in the inflammatory and 
stress-response pathways strongly suggests that drugs targeting these kinases could have 
strong immunosuppressive actions. These drugs can prove valuable for the treatment of 
rheumatoid arthritis, atherosclerosis, autoimmune disorders and organ transplantation 
among others. At least one very important class of immunosuppressants, corticosteroids, 
functions by blocking SAPK activation at an as yet undefined site on this pathway 
(Swantek, J.L. et al (1997) Mol. Cell. Biol. 17:6274-6282). Other immunosuppressive 
drugs like the pyridinyl imidazoles specifically target the p38 kinases (Kumar, S. et al 
(1997) Biochem. Biophys. Res. Commun. 235: 533-538). Drug targeting of the MAPK 
and p38 pathways could lead to the development of novel immunosuppressants. 

Human SULU and GEK 

[0421] The potential role of these novel STE20-related protein kinases in the 
control of mitosis strongly suggests that agents that specifically inhibit these kinases could 
be useful for cancer and metastases treatment. 

[0422] The close homology of human STLK5, GEK2, SULU1 and SULU3 to 
STE20-subfamily kinases involved in the stress and oxidative response pathway strongly 
suggests that drugs targeting these kinases may also be useful as immunosuppressants as 
well as to treat ischemic disorders. 

Human KHS2 

[0423] The role of human KHS2 in the inflammatory and stress-response 
pathways strongly suggests that drugs targeting this and related kinases could have strong 
immunosuppressive actions. These drugs can prove valuable for the treatment of 
rheumatoid arthritis, atherosclerosis, autoimmune disorders and organ transplantation 
among others. At least one very important class of immunosuppressants, corticosteroids, 
functions by blocking SAPK activation at an as yet undefined site on this pathway 
(Swantek, J.L. et al (1997) Mol. Cell. Biol. 17:6274-6282). Other immunosuppressive 
drugs like the pyridinyl imidazoles specifically target the p38 kinases (Kumar, S. et al 
(1997) Biochem. Biophys. Res. Commun. 235: 533-538). Drug targeting of the MAPK 
and p38 pathways could lead to the development of novel immunosuppressants. 

Human PAK family 

[0424] PAK5 has a role in cancer based on its expression profile (elevated RNA 
and protein levels in a wide variety of tumor lines), its interaction with Cdc42 via its PBD, 
and the ability of the kinase-dead allele of PAK5 (Lys350, 351 Ala) to block ras 
transformation of NIH3T3 cells. Thus, a screen for small molecule inhibitors of PAK5 
kinase activity may yield compounds with therapeutic potential for intervention in cancers 
and metastases derived from a wide range of tissue types. 

[0425] PAK5 maps to a chromosomal region frequently amplified in a variety of 
tumors including those from non-small cell lung cancer, and small cell lung cancer. These 
findings suggest that PAK5 could play a role in the development, maintenance, or 
progression of human tumors and/or metastases. 

[0426] The role of human PAK4 and PAK5 in the inflammatory and stress- 
response pathways also strongly suggests that drugs targeting these kinases could have 
strong immunosuppressive actions. These drugs can prove valuable for the treatment of 
rheumatoid arthritis, atherosclerosis, autoimmune disorders and organ transplantation 
among others. At least one very important class of immunosuppressants, corticosteroids, 
functions by blocking SAPK activation at an as yet undefined site on this pathway 
(Swantek, J.L. et al (1997) Mol. Cell. Biol. 17:6274-6282). Other immunosuppressive 
drugs like the pyridinyl imidazoles specifically target the p38 kinases (Kumar, S. et al 
(1997) Biochem. Biophys. Res. Commun. 235: 533-538). Drug targeting of the MAPK 
and p38 pathways could lead to the development of novel immunosuppressants. In addition, 
drugs targeting PAK4 or PAK5 could prove useful as immunosuppressants as well as in 
AIDS treatment. 

VIII. Transgenic Animals 

[0427] A variety of methods are available for the production of transgenic animals 
associated with this invention. DNA can be injected into the pronucleus of a fertilized egg 
before fusion of the male and female pronuclei, or injected into the nucleus of an 
embryonic cell (e.g., the nucleus of a two-cell embryo) following the initiation of cell 
division (Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442, 1985). Embryos can 
be infected with viruses, especially retroviruses, modified to carry kinase 
nucleotide sequences of the invention. 

[0428] Pluripotent stem cells derived from the inner cell mass of the embryo and 
stabilized in culture can be manipulated in culture to incorporate nucleotide sequences of 
the invention. A transgenic animal can be produced from such cells through implantation 
into a blastocyst that is implanted into a foster mother and allowed to come to term. 
Animals suitable for transgenic experiments can be obtained from standard commercial 
sources such as Charles River (Wilmington, MA), Taconic (Germantown, NY), Harlan 
Sprague Dawley (Indianapolis, IN), etc. 

[0429] The procedures for manipulation of the rodent embryo and for 
microinjection of DNA into the pronucleus of the zygote are well known to those of 
ordinary skill in the art (Hogan et al., supra). Microinjection procedures for fish, 
amphibian eggs and birds are detailed in Houdebine and Chourrout (Experientia 47: 897- 
905, 1991). Other procedures for introduction of DNA into tissues of animals are 
described in U.S. Patent No. 4,945,050 (Sandford et al., July 30, 1990). 

[0430] By way of example only, to prepare a transgenic mouse, female mice are 
induced to superovulate. Females are placed with males, and the mated females are 
sacrificed by CO2 asphyxiation or cervical dislocation and embryos are recovered from 
excised oviducts. Surrounding cumulus cells are removed. Pronuclear embryos are then 
washed and stored until the time of injection. Randomly cycling adult female mice are 
paired with vasectomized males. Recipient females are mated at the same time as donor 
females. Embryos then are transferred surgically. The procedure for generating transgenic 
rats is similar to that of mice (Hammer et al., Cell 63:1099-1112, 1990). 

[0431] Methods for the culturing of embryonic stem (ES) cells and the subsequent 
production of transgenic animals by the introduction of DNA into ES cells using methods 
such as electroporation, calcium phosphate/DNA precipitation and direct injection also are 
well known to those of ordinary skill in the art (Teratocarcinomas and Embryonic Stem 
Cells, A Practical Approach, E.J. Robertson, ed., IRL Press, 1987). 

[0432] In cases involving random gene integration, a clone containing the 
sequence(s) of the invention is co-transfected with a gene encoding neomycin resistance. 
Alternatively, the gene encoding neomycin resistance is physically linked to the 
sequence(s) of the invention. Transfection and isolation of desired clones are carried out 
by any one of several methods well known to those of ordinary skill in the art (E.J. 
Robertson, supra). 

[0433] DNA molecules introduced into ES cells can also be integrated into the 
chromosome through the process of homologous recombination (Capecchi, Science 244: 
1288-1292, 1989). Methods for positive selection of the recombination event (i.e., neo 
resistance) and dual positive-negative selection (i.e., neo resistance and ganciclovir 
resistance) and the subsequent identification of the desired clones by PCR have been 
described by Capecchi, supra and Joyner et al. (Nature 338: 153-156, 1989), the teachings 
of which are incorporated herein in their entirety including any drawings. The final phase 
of the procedure is to inject targeted ES cells into blastocysts and to transfer the blastocysts 
into pseudopregnant females. The resulting chimeric animals are bred and the offspring 
are analyzed by Southern blotting to identify individuals that carry the transgene. 
Procedures for the production of non-rodent mammals and other animals have been 
discussed by others (Houdebine and Chourrout, supra; Pursel et al., Science 244:1281- 
1288, 1989; and Simms et al., Bio/Technology 6:179-183, 1988). 

[0434] Thus, the invention provides transgenic, nonhuman mammals containing a 
transgene encoding a kinase of the invention or a gene affecting the expression of the 
kinase. Such transgenic nonhuman mammals are particularly useful as an in vivo test 
system for studying the effects of introduction of a kinase, or regulating the expression of a 
kinase (i.e., through the introduction of additional genes, antisense nucleic acids, or 
ribozymes). 

[0435] A "transgenic animal" is an animal having cells that contain DNA which 
has been artificially inserted into a cell, which DNA becomes part of the genome of the 
animal which develops from that cell. Preferred transgenic animals are primates, mice, 
rats, cows, pigs, horses, goats, sheep, dogs and cats. The transgenic DNA may encode 
human STE20-related kinases. Native expression in an animal may be reduced by 
providing an amount of anti-sense RNA or DNA effective to reduce expression of the 
kinase. 

IX. Gene Therapy 

[0436] STE20-related kinases or their genetic sequences will also be useful in gene 
therapy (reviewed in Miller, Nature 357:455-460, 1992). Miller states that advances have 
resulted in practical approaches to human gene therapy that have demonstrated positive 
initial results. The basic science of gene therapy is described in Mulligan (Science 
260:926-931, 1993). 

[0437] In one preferred embodiment, an expression vector containing STE20- 
related kinase coding sequence is inserted into cells, the cells are grown in vitro and then 
infused in large numbers into patients. In another preferred embodiment, a DNA segment 
containing a promoter of choice (for example a strong promoter) is transferred into cells 
containing an endogenous gene encoding kinases of the invention in such a manner that 
the promoter segment enhances expression of the endogenous kinase gene (for example, 
the promoter segment is transferred to the cell such that it becomes directly linked to the 
endogenous kinase gene). 

[0438] The gene therapy may involve the use of an adenovirus containing kinase 
cDNA targeted to a tumor, systemic kinase increase by implantation of engineered cells, 
injection with kinase-encoding virus, or injection of naked kinase DNA into appropriate 
tissues. 

[0439] Target cell populations may be modified by introducing altered forms of 
one or more components of the protein complexes in order to modulate the activity of such 
complexes. For example, by reducing or inhibiting a complex component activity within 
target cells, an abnormal signal transduction event(s) leading to a condition may be 
decreased, inhibited, or reversed. Deletion or missense mutants of a component, that 
retain the ability to interact with other components of the protein complexes but cannot 
function in signal transduction may be used to inhibit an abnormal, deleterious signal 
transduction event. 

[0440] Expression vectors derived from viruses such as retroviruses, vaccinia 
virus, adenovirus, adeno-associated virus, herpes viruses, several RNA viruses, or bovine 
papilloma virus, may be used for delivery of nucleotide sequences (e.g., cDNA) encoding 
recombinant kinase of the invention protein into the targeted cell population (e.g., tumor 
cells). Methods which are well known to those skilled in the art can be used to construct 
recombinant viral vectors containing coding sequences (Maniatis et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y., 1989; Ausubel 
et al., Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley 
Interscience, N.Y., 1989). Alternatively, recombinant nucleic acid molecules encoding 
protein sequences can be used as naked DNA or in a reconstituted system, e.g., liposomes 
or other lipid systems, for delivery to target cells (e.g., Felgner et al., Nature 337:387-8, 
1989). Several other methods for the direct transfer of plasmid DNA into cells exist for 
use in human gene therapy and involve targeting the DNA to receptors on cells by 
complexing the plasmid DNA to proteins (Miller, supra). 

[0441] In its simplest form, gene transfer can be performed by simply injecting 
minute amounts of DNA into the nucleus of a cell, through a process of microinjection 
(Capecchi, Cell 22:479-88, 1980). Once recombinant genes are introduced into a cell, 
they can be recognized by the cell's normal mechanisms for transcription and translation, 
and a gene product will be expressed. Other methods have also been attempted for 
introducing DNA into larger numbers of cells. These methods include: transfection, 
wherein DNA is precipitated with CaPO4 and taken into cells by pinocytosis (Chen et al., 
Mol. Cell Biol. 7:2745-52, 1987); electroporation, wherein cells are exposed to large 
voltage pulses to introduce holes into the membrane (Chu et al., Nucleic Acids Res. 
15:1311-26, 1987); lipofection/liposome fusion, wherein DNA is packaged into lipophilic 
vesicles which fuse with a target cell (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417, 
1987); and particle bombardment using DNA bound to small projectiles (Yang et 
al., Proc. Natl. Acad. Sci. 87:9568-9572, 1990). Another method for introducing DNA 
into cells is to couple the DNA to chemically modified proteins. 

[0442] It has also been shown that adenovirus proteins are capable of destabilizing 
endosomes and enhancing the uptake of DNA into cells. The admixture of adenovirus to 
solutions containing DNA complexes, or the binding of DNA to polylysine covalently 
attached to adenovirus using protein crosslinking agents substantially improves the uptake 
and expression of the recombinant gene (Curiel et al., Am. J. Respir. Cell. Mol. Biol., 
6:247-52, 1992). 

[0443] As used herein "gene transfer" means the process of introducing a foreign 
nucleic acid molecule into a cell. Gene transfer is commonly performed to enable the 
expression of a particular product encoded by the gene. The product may include a 
protein, polypeptide, anti-sense DNA or RNA, or enzymatically active RNA. Gene 
transfer can be performed in cultured cells or by direct administration into animals. 
Generally gene transfer involves the process of nucleic acid contact with a target cell by 
non-specific or receptor mediated interactions, uptake of nucleic acid into the cell through 
the membrane or by endocytosis, and release of nucleic acid into the cytoplasm from the 
plasma membrane or endosome. Expression may require, in addition, movement of the 
nucleic acid into the nucleus of the cell and binding to appropriate nuclear factors for 
transcription. 

[0444] As used herein "gene therapy" is a form of gene transfer and is included 
within the definition of gene transfer as used herein and specifically refers to gene transfer 
to express a therapeutic product from a cell in vivo or in vitro. Gene transfer can be 
performed ex vivo on cells which are then transplanted into a patient, or can be performed 
by direct administration of the nucleic acid or nucleic acid-protein complex into the 
patient. 

[0445] In another preferred embodiment, a vector having nucleic acid sequences 
encoding a STE20-related kinase polypeptide is provided in which the nucleic acid 
sequence is expressed only in specific tissue. Methods of achieving tissue-specific gene 
expression are set forth in International Publication No. WO 93/09236, filed November 3, 
1992 and published May 13, 1993. 

[0446] In all of the preceding vectors set forth above, a further aspect of the 
invention is that the nucleic acid sequence contained in the vector may include additions, 
deletions or modifications to some or all of the sequence of the nucleic acid, as defined 
above. 

[0447] In another preferred embodiment, a method of gene replacement is set forth. 
"Gene replacement" as used herein means supplying a nucleic acid sequence which is 
capable of being expressed in vivo in an animal and thereby providing or augmenting the 
function of an endogenous gene which is missing or defective in the animal. 

X. Administration of Substances 

[0448] Methods of determining the dosages of compounds to be administered to a 
patient and modes of administering compounds to an organism are disclosed in U.S. 
Application Serial No. 08/702,282, filed August 23, 1996 and International patent 
publication number WO 96/22976, published August 1, 1996, both of which are 
incorporated herein by reference in their entirety, including any drawings, figures, or 
tables. Those skilled in the art will appreciate that such descriptions are applicable to the 
present invention and can be easily adapted to it. 

[0449] The proper dosage depends on various factors such as the type of disease 
being treated, the particular composition being used, and the size and physiological 
condition of the patient. Therapeutically effective doses for the compounds described 
herein can be estimated initially from cell culture and animal models. For example, a dose 
can be formulated in animal models to achieve a circulating concentration range that 
initially takes into account the IC50 as determined in cell culture assays. The animal model 
data can be used to more accurately determine useful doses in humans. 

[0450] Plasma half-life and biodistribution of the drug and metabolites in the 
plasma, tumors, and major organs can also be determined to facilitate the selection of 
drugs most appropriate to inhibit a disorder. Such measurements can be carried out. For 
example, HPLC analysis can be performed on the plasma of animals treated with the drug 
and the location of radiolabeled compounds can be determined using detection methods 
such as X-ray, CAT scan, and MRI. Compounds that show potent inhibitory activity in the 
screening assays, but have poor pharmacokinetic characteristics, can be optimized by 
altering the chemical structure and retesting. In this regard, compounds displaying good 
pharmacokinetic characteristics can be used as a model. 

[0451] Toxicity studies can also be carried out by measuring the blood cell 
composition. For example, toxicity studies can be carried out in a suitable animal model 
as follows: 1) the compound is administered to mice (an untreated control mouse should 
also be used); 2) blood samples are periodically obtained via the tail vein from one mouse 
in each treatment group; and 3) the samples are analyzed for red and white blood cell 
counts, blood cell composition, and the percent of lymphocytes versus polymorphonuclear 
cells. A comparison of results for each dosing regime with the controls indicates if 
toxicity is present. 

[0452] At the termination of each toxicity study, further studies can be carried out 
by sacrificing the animals (preferably, in accordance with the American Veterinary 
Medical Association guidelines Report of the American Veterinary Medical Assoc. Panel 
on Euthanasia, Journal of American Veterinary Medical Assoc, 202:229-249, 1993). 
Representative animals from each treatment group can then be examined by gross 
necropsy for immediate evidence of metastasis, unusual illness, or toxicity. Gross 
abnormalities in tissue are noted, and tissues are examined histologically. Compounds 
causing a reduction in body weight or blood components are less preferred, as are 
compounds having an adverse effect on major organs. In general, the greater the adverse 
effect the less preferred the compound. 

[0453] For the treatment of cancers the expected daily dose of a hydrophobic 
pharmaceutical agent is between 1 to 500 mg/day, preferably 1 to 250 mg/day, and most 
preferably 1 to 50 mg/day. Drugs can be delivered less frequently provided plasma levels 
of the active moiety are sufficient to maintain therapeutic effectiveness. 

[0454] Plasma levels should reflect the potency of the drug. Generally, the more 
potent the compound the lower the plasma levels necessary to achieve efficacy. 

EXAMPLES 

[0455] The examples below are not limiting and are merely representative of 
various aspects and features of the present invention. The examples below demonstrate 
the isolation and characterization of the STE20-related kinases of the invention. 

EXAMPLE 1: Isolation of cDNAs Encoding Mammalian STE20-related Protein Kinases 
Materials and Methods 
Identification of novel clones 

[0456] Total RNAs were isolated using the Guanidine Salts/Phenol extraction 
protocol of Chomczynski and Sacchi (P. Chomczynski and N. Sacchi, Anal. Biochem. 
162, 156 (1987)) from primary human tumors, normal and tumor cell lines, normal human 
tissues, and sorted human hematopoietic cells. These RNAs were used to generate 
single-stranded cDNA using the Superscript Preamplification System (GIBCO BRL, 
Gaithersburg, MD; Gerard, GF et al. (1989), FOCUS 11, 66) under conditions 
recommended by the manufacturer. A typical reaction used 10 µg total RNA with 1.5 µg 
oligo(dT)12-18 in a reaction volume of 60 µL. The product was treated with RNaseH and 
diluted to 100 µL with H2O. For subsequent PCR amplification, 1-4 µL of this sscDNA 
was used in each reaction. 

[0457] Degenerate oligonucleotides were synthesized on an Applied Biosystems 
3948 DNA synthesizer using established phosphoramidite chemistry, precipitated with 
ethanol and used unpurified for PCR. The sequence of some of the degenerate 
oligonucleotide primers and the amino acid motif they encode is as follows: 

TRK1    5'-CTGAATTCGGNGCNTTYGGNAARGT-3' (SEQ ID NO:32) 
        GAFGKV (sense) (SEQ ID NO:37) 

TRK4    5'-GCTGGATCCYTCNGGNGGCATCCA-3' (SEQ ID NO:33) 
        WMPPE (antisense) (SEQ ID NO:38) 

ROS1    5'-GCNTTYGGNGARGTNTAYGARGG-3' (SEQ ID NO:34) 
        AFGEVYEG (sense) (SEQ ID NO:39) 

CCK4b   5'-GCTGGATCCYTCNGGNSWCATCCA-3' (SEQ ID NO:35) 
        WMSPE (antisense) (SEQ ID NO:40) 

CCK4c   5'-GAGTTYGGNGARGTNTTYYTNGC-3' (SEQ ID NO:36) 
        EFGEVYEG (sense) (SEQ ID NO:41) 

[0458] These primers were derived from the sense and antisense strands of 
conserved motifs within the catalytic domain of several protein kinases. Degenerate 
nucleotide residue designations are: N = A, C, G, or T; R = A or G; Y = C or T; H = A, C 
or T not G; D = A, G or T not C; S = C or G; and W = A or T. 
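
By way of illustration only, the degenerate residue designations above can be expanded programmatically. The following Python sketch is not part of the procedure described herein; the names IUPAC, degeneracy and expand are hypothetical and chosen for this example. It counts and enumerates the distinct sequences represented by a degenerate primer.

```python
from itertools import product

# Degenerate residue designations as listed in the text above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "R": "AG", "Y": "CT",
    "H": "ACT", "D": "AGT", "S": "CG", "W": "AT",
}


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Enumerate every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    trk1 = "CTGAATTCGGNGCNTTYGGNAARGT"  # TRK1 primer from the text
    print(degeneracy(trk1))
    print(expand("AAR"))
```

Under these assumptions, the TRK1 primer contains three N, one Y and one R position and therefore represents 4 x 4 x 4 x 2 x 2 = 256 distinct sequences.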

[0459] PCR reactions were performed using degenerate primers applied to multiple 
single-stranded cDNAs. The primers were added at a final concentration of 5 µM each to 
a mixture containing 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 200 µM each 
deoxynucleoside triphosphate, 0.001% gelatin, 1.5 U AmpliTaq DNA Polymerase 
(Perkin-Elmer/Cetus), and 1-4 µL cDNA. Following 3 min denaturation at 95 °C, the cycling 
conditions were 94 °C for 30 s, 50 °C for 1 min, and 72 °C for 1 min 45 s for 35 cycles. 
PCR fragments migrating between 300-350 bp were isolated from 2% agarose gels using 
the GeneClean Kit (Bio101), and T-A cloned into the pCRII vector (Invitrogen Corp. 
U.S.A.) according to the manufacturer's protocol. 

[0460] Colonies were selected for mini plasmid DNA-preparations using Qiagen 
columns and the plasmid DNA was sequenced using a cycle sequencing dye-terminator kit 
with AmpliTaq DNA Polymerase, FS (ABI, Foster City, CA). Sequencing reaction 
products were run on an ABI Prism 377 DNA Sequencer, and analyzed using the BLAST 
alignment algorithm (Altschul, S.F. et al., J. Mol. Biol. 215:403-10). 

[0461] Additional PCR strategies were employed to connect various PCR 
fragments or ESTs using exact or near exact oligonucleotide primers as detailed in the 
results section for each cDNA. PCR conditions were as described above except the 
annealing temperatures were calculated for each oligo pair using the formula: Tm = 
4(G+C)+2(A+T). 
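
As a worked illustration of the formula Tm = 4(G+C)+2(A+T), commonly referred to as the Wallace rule, the following Python sketch computes the value for a non-degenerate oligonucleotide. The function name wallace_tm is hypothetical, and the rejection of degenerate residues is an assumption of this example rather than a feature of the procedure described herein.

```python
def wallace_tm(oligo):
    """Return Tm in degrees C using Tm = 4(G+C) + 2(A+T)."""
    seq = oligo.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        # Assumption: degenerate or unknown residues are not supported here.
        raise ValueError("oligo must contain only A, C, G and T")
    return 4 * gc + 2 * at


if __name__ == "__main__":
    # Primer pair of SEQ ID NO:42 and SEQ ID NO:43 from the text.
    for primer in ("CACAGAAACGGTCAGATTCAC", "GATCAGGGTGACATCAAGGGAC"):
        print(primer, wallace_tm(primer))
```

For the primers of SEQ ID NO:42 (10 G/C and 11 A/T residues) and SEQ ID NO:43 (12 G/C and 10 A/T residues), the formula gives 62 and 68, respectively.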

Isolation of cDNA clones: 

[0462] Human cDNA libraries were probed with PCR or EST fragments 
corresponding to STE20-related genes. Probes were 32P-labeled by random priming and 
used at 2x10^6 cpm/mL following standard techniques for library screening. 
Pre-hybridization (3 h) and hybridization (overnight) were conducted at 42 °C in 5X SSC, 5X 
Denhardt's solution, 2.5% dextran sulfate, 50 mM Na2PO4/NaHPO4, pH 7.0, 50% 
formamide with 100 mg/mL denatured salmon sperm DNA. Stringent washes were 
performed at 65 °C in 0.1X SSC and 0.1% SDS. DNA sequencing was carried out on both 
strands using a cycle sequencing dye-terminator kit with AmpliTaq DNA Polymerase, FS 
(ABI, Foster City, CA). Sequencing reaction products were run on an ABI Prism 377 
DNA Sequencer. 

Makegene Bioinformatics EST assembler 

[0463] The EST reports were downloaded from the National Center for 
Biotechnology Information (NCBI). After uncompressing the files, the program 'report2est' was 
scripted to extract the following information: 1) EST names, 2) GenBank Accession 
numbers, 3) GenBank gi numbers, 4) Clone Id numbers, 5) the nucleotide sequences of the 
ESTs, 6) the organism, 7) the library name, 8) the name of the lab, and 9) the institution. 
The output of 'report2est' is a file in FASTA format with all of the information listed 
above in the first line of each entry except the sequence, which is listed in the second line 
of each entry. The resulting file is formatted for BLAST using 'pressdb' (available as part 
of the NCBI toolkit). 
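
The two-line FASTA layout described above may be sketched as follows. This Python example is illustrative only and is not the 'report2est' program itself; the field names, the '|' delimiter and the sample record are assumptions made for the purpose of the example.

```python
# Hypothetical field order for the header line; the text lists these items
# but does not specify their names or the delimiter used between them.
HEADER_FIELDS = [
    "est_name", "accession", "gi", "clone_id",
    "organism", "library", "lab", "institution",
]


def to_fasta_entry(record):
    """Format one EST record as a header line followed by a sequence line."""
    header = "|".join(str(record[field]) for field in HEADER_FIELDS)
    return ">" + header + "\n" + record["sequence"] + "\n"


if __name__ == "__main__":
    # Placeholder values only; not taken from any real EST report.
    example = {
        "est_name": "est_example.r", "accession": "ACC0001", "gi": "0000001",
        "clone_id": "CLONE:0001", "organism": "Homo sapiens",
        "library": "example library", "lab": "example lab",
        "institution": "example institution", "sequence": "ACGTACGT",
    }
    print(to_fasta_entry(example), end="")
```

Keeping all descriptive fields on the header line allows later steps to filter search results by species or clone identifier without re-reading the original reports.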

[0464] To build a gene or part of a gene from ESTs, the program 'makegene' was 
developed. Input to this program is a query sequence and the organism/species for which a 
gene is to be built. An initial search of the formatted EST database described above is 
performed using BLAST (blastn). Any results that contain warnings, such as polyA tails 
or other repeat elements, are eliminated from future queries. The program 
'blast_parse_reports' was developed to extract the FASTA header line from the search 
results and the output is then filtered to extract only FASTA header lines for the desired 
species. 

[0465] The initial results, having been filtered for warnings and species, go into a 
loop in which searches against the database are repeated until no new ESTs are found. 
The loop consists of the following steps: 1) when possible the names of both ends of the 
ESTs are extracted from the database by searching using the 'Clone Id' field or the part of 
the 'EST name' field before the .r or .s postscript, 2) any ESTs that have been used as 
queries in previous loops are removed from the current query by the program 'subtract', 3) 
the resulting list of ESTs is used to extract the sequences from the database by the program 
'batch_parse_fasta', 4) BLAST is run against the database using each sequence, 5) the 
output files from BLAST containing warnings are removed, 6) the results are filtered by 
species, and 7) the loop is reentered if there were new ESTs found in the previous pass 
through the loop. 
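
The iterative search described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions and is not the 'makegene' program: the function collect_ests and the search callback are hypothetical, the callback stands in for a BLAST search and is assumed to return (hit identifier, species, warning flag) tuples, and step 1 (retrieving both ends of a clone) is omitted.

```python
def collect_ests(initial_hits, search, species):
    """Repeat searches until no new ESTs are found.

    search(est_id) is a hypothetical stand-in for a BLAST query; it must
    return an iterable of (hit_id, hit_species, has_warning) tuples.
    """
    used = set()                 # ESTs already used as queries
    found = set(initial_hits)    # every EST accepted so far
    queries = set(initial_hits)
    while queries:
        new_hits = set()
        for est in sorted(queries):
            used.add(est)
            for hit_id, hit_species, has_warning in search(est):
                # Drop results carrying warnings or from other species.
                if has_warning or hit_species != species:
                    continue
                new_hits.add(hit_id)
        found |= new_hits
        # Remove anything already queried, as the 'subtract' step does.
        queries = new_hits - used
    return found


if __name__ == "__main__":
    # Toy database for demonstration; identifiers are placeholders.
    toy = {
        "e1": [("e2", "human", False), ("e3", "mouse", False)],
        "e2": [("e1", "human", False), ("e4", "human", True),
               ("e5", "human", False)],
        "e5": [("e2", "human", False)],
    }
    print(sorted(collect_ests(["e1"], lambda e: toy.get(e, []), "human")))
```

Because each EST is queried at most once, the loop terminates when a pass produces no identifiers that have not already been used as queries.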

[0466] The ESTs chosen by 'makegene' are used as input for the program 
'mpd2_cluster' (Hide, W., Burke, J, and Davison, D. U. of Houston, unpublished) which 
clusters overlapping sequences. The programs 'contig' (Kerlavage, T., TIGR, 
unpublished), 'gde2mult' and 'gde2sing' (Smith, S.W., et al., CABIOS 10, 671-675 
(1994)), are used to make an alignment and consensus sequence of the overlapping ESTs. 

RESULTS 

cDNA cloning and characterization of STLK2 

[0467] The human STLK2 cDNA sequence is composed of two overlapping EST 
fragments, AA191319 and W16504, that were identified using a Smith-Waterman search 
of the EST database with STLK1 (MST3 GB:AF024636) as a query. The complete 
sequence of both clones was determined and used to generate the full-length human STLK2 
sequence. 

[0468] EST clone AA191319 contains a 1327 bp insert and an ORF of 1146 bp 
(382 amino acids). EST clone W16504 contains a 2474 bp insert (not including the 
poly-A tail) and an ORF of 687 bp (229 amino acids). 

[0469] The full-length human STLK2 cDNA (SEQ ID NO. 1) is 3268 bp long. 
AA191319 spans positions 1-1327 and W16504 positions 743-3216. The overlap between 
these two clones exhibits 100% sequence identity. The human STLK2 cDNA contains a 
1248 bp ORF flanked by a 181 bp 5' UTR (1-181) and a 1784 bp 3' UTR (1433-3216) that 
is followed by a 52 nucleotide polyadenylated region. A polyadenylation signal 
(AATAAA) is found at positions 3193-3198. The sequence flanking the first ATG 
conforms to the Kozak consensus for an initiating methionine, and is believed to be the 
translational start site for STLK2. Furthermore, human STLK2, and the related SOK-1 
and MST3 proteins, conserve the amino acid sequence immediately following this 
presumed initiating methionine. 
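
The coordinate arithmetic used in describing the cDNAs can be checked with a short calculation. The following Python sketch is illustrative only; the function names span_length and orf_to_residues are hypothetical. It assumes 1-based inclusive positions, as used in the text.

```python
def span_length(start, end):
    """Length in bp of a segment given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def orf_to_residues(orf_bp):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


if __name__ == "__main__":
    # Human STLK2 values taken from the text.
    print(span_length(1, 181))        # 5' UTR
    print(span_length(1433, 3216))    # 3' UTR
    print(3216 + 52)                  # last UTR position plus poly-A region
    print(orf_to_residues(1248))      # residues encoded by the ORF
```

With the STLK2 values given above, the 5' UTR is 181 bp, the 3' UTR is 1784 bp, and position 3216 plus the 52 nucleotide polyadenylated region gives the stated total length of 3268 bp.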

[0470] Several EST fragments span the complete STLK2 sequence with 
AA191319 at the 5' end and W16504 at the 3' end. 

[0471] All searches against the public nucleic acid database (NRN) and protein 
database (NRP) were conducted using the Smith-Waterman gap alignment program 
(Smith, TF and Waterman, MS (1981) J. Mol. Biol. 147, 195-197) with the PAM100 
matrix and gap open and extension penalties of 14:1, respectively. 
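
For readers unfamiliar with the method, a minimal Smith-Waterman local alignment score with separate gap open and gap extension penalties may be sketched as follows. This Python example is not the gap alignment program used for the searches described herein. In place of the PAM100 substitution matrix it uses simple match and mismatch scores, and it assumes that the first position of a gap costs the gap open penalty and each further position costs the extension penalty; both are assumptions of this sketch.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap_open=14, gap_extend=1):
    """Return the best local alignment score between sequences a and b."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # h: best score ending at (i, j); e and f: best score ending in a gap.
    h = [[0] * cols for _ in range(rows)]
    e = [[neg_inf] * cols for _ in range(rows)]
    f = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Open a new gap or extend an existing one in either sequence.
            e[i][j] = max(h[i][j - 1] - gap_open, e[i][j - 1] - gap_extend)
            f[i][j] = max(h[i - 1][j] - gap_open, f[i - 1][j] - gap_extend)
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are never allowed to fall below zero.
            h[i][j] = max(0, h[i - 1][j - 1] + pair, e[i][j], f[i][j])
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGTACGT", "ACGTGACGT"))
    print(smith_waterman_score("ACGTACGT", "ACGTGACGT", gap_open=3, gap_extend=1))
```

A high gap open penalty relative to the extension penalty, as in the 14:1 setting noted above, discourages many short gaps while tolerating an occasional long one.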

cDNA Cloning and Characterization of STLK3 

[0472] A mammalian STLK3 clone, 135-31-19, was first identified from a PCR 
screen with the degenerate oligos, TRK1 and TRK4, applied to a sscDNA generated from 
adult rat brain substantia nigra. Sequence analysis of the 457 bp insert indicated that it 
represented a novel member of the STE20-subfamily of STKs. 

[0473] A Smith-Waterman search of the EST database with the rat STLK3 
fragment and human STLK1 (MST3 GB:AF024636) as queries identified several 
overlapping ESTs spanning most of the human STLK3 cDNA sequence. A Makegene 
analysis generated a 3037 bp contig from approximately 44 EST sequences. Since the 3' 
ESTs were not commercially available, a pair of primers 
(5'-CACAGAAACGGTCAGATTCAC-3' (SEQ ID NO:42) and 
5'-GATCAGGGTGACATCAAGGGAC-3' (SEQ ID NO:43)) was derived from this region 
to generate PCR clone 3R21-20-6 from human fetal liver sscDNA. This clone and EST 
AA278967 were fully sequenced to generate the full-length STLK3 cDNA sequence. 

[0474] AA278967 is a 837 bp EST isolated by the IMAGE consortium from cDNA 
made from CD20+/IgD- germinal center B cells sorted from human tonsillar cells. 

[0475] PCR clone 3R21-20-6 was isolated from human fetal sscDNA and contains 
a 1116 bp insert, including a 1086 bp ORF encoding the 362 C-terminal amino acids of 
STLK3. 

[0476] The full-length human STLK3 cDNA (SEQ ID NO. 2) is 3030 bp long. 
AA278967 spans positions 1-814 and 3R21-20-6 spans positions 464-1579. The overlap 
between these two clones exhibits 100% sequence identity. The remaining 1452 bp of 3' 
UTR is derived from an assembly of multiple unconfirmed EST fragments. 

[0477] The near full-length human STLK3 cDNA (SEQ ID NO.2) is 3030 bp long 
and consists of a 1548 bp ORF flanked by a 1476 bp 3' UTR (1550-3025) and a 5 
nucleotide polyadenylated region. A polyadenylation signal (AATAAA) begins at position 
3004. Since the coding region is open throughout the 5' extent of this sequence, this is 
apparently a partial cDNA clone lacking the N-terminal start methionine. Six copies of a 
"GGCCCC" repeat were observed in positions 21-67. Five independent ESTs 
(AA150838, AA286879, AA251679, AA252004, AA278967) showed the same repeat, 
suggesting that this sequence may be an integral region of the human STLK3 gene. 
Stronger evidence for this being the case is provided by the sequence of the murine 
orthologue of STLK3 represented by a 876 bp EST W20737. 

[0478] Multiple EST fragments span the complete STLK3 sequence with 
AA278967 at the 5' end and AA628477 and others at the 3' end. 

cDNA Cloning and Characterization of STLK4 

[0479] The human STLK4 cDNA sequence is composed of two overlapping EST 
fragments, AA297759 and AA100484, that were identified using a Smith-Waterman 
search of the EST database with STLK1 (MST3 GB:AF024636) as a query. The complete 
sequence of both clones was determined and used to generate the near full-length human 
STLK4 sequence. 

[0480] AA100484 is an IMAGE consortium cDNA clone isolated from the T-84 
colonic epithelium cell line. It has an insert of 3694 bp and a coding region of 1146 bp 
(382 amino acids). A Smith-Waterman sequence alignment against the NRN database 
showed this EST to be 71.4% identical to the human STE20-like kinase (GB:X99325). 

[0481] W16504 is an IMAGE consortium clone isolated from a human fetal heart 
cDNA library. It has an insert length of 2474 bp (not including the poly-A tail) and a 
coding region of 687 bp (229 amino acids). A Smith-Waterman sequence alignment of 
W16504 against the NRN database showed this EST to be 69.2% identical to the human 
STE20-like kinase (GB:X99325). 

[0482] The full-length human STLK2 cDNA (SEQ ID NO. 1) is 3268 bp long. 
AA191319 spans positions 1-1327, and W16504 positions 743-3216. The overlap 
between these two clones is 585 bp long with 100% sequence identity. 
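
The length of the overlap between two clones follows directly from their positions in the assembled sequence. The following Python sketch is illustrative only; the function name overlap_length is hypothetical, and positions are assumed to be 1-based and inclusive.

```python
def overlap_length(span_a, span_b):
    """Return the number of positions shared by two (start, end) spans."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    # Spans that do not intersect share zero positions.
    return max(0, end - start + 1)


if __name__ == "__main__":
    # AA191319 (1-1327) and W16504 (743-3216), as stated in the text.
    print(overlap_length((1, 1327), (743, 3216)))
```

For AA191319 and W16504 the shared region runs from position 743 to position 1327, which is 1327 - 743 + 1 = 585 bp, in agreement with the overlap stated above.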

[0483] AA100484 is an IMAGE consortium cDNA clone isolated from the T-84 
colonic epithelium cell line. AA100484 covers the bulk of human STLK4 with its 3694 
bp, which spans positions 146-3839 of SEQ ID NO:3. A second EST, AA297759, isolated 
from a Jurkat T cell cDNA library, spans positions 1-271 of the human STLK4 contig. 
The two ESTs overlap over a 126 bp stretch that has only one nucleotide discrepancy at 
position 149 (G in AA297759 and T in AA100484). A T at this position was chosen for 
SEQ ID NO:3 based on sequence data generated from AA100484. The 5' 145 bp of 
human STLK4 contain three sequencing ambiguities (N's in SEQ ID NO:3) arising from 
sequence errors in the GenBank entry for AA297759. Three amino acid sequence 
ambiguities in the N-terminus of human STLK4 are present also in SEQ ID NO:7 as a 
consequence of the sequence inaccuracies from the EST entry. 

[0484] The coding region of human STLK4 is 1242 bp long (2-1243), capable of 
encoding a 414 amino acid polypeptide, and is followed by a 2596 nucleotide 3' UTR 
(1244-3839). Human STLK4 ends in a polyadenylated stretch that has 18 adenines 
(3840-3857). A polyadenylation signal (AATAAA) is found between positions 3822-3827. 
Targeted-PCR cloning identified one rat orthologue of human STLK4, clone 135-31-19. 
In addition, one murine orthologue of human STLK4 was recognized in the EST database 
as AA117483. None of these orthologues add additional N-terminal sequence to the 
human STLK4. 

[0485] The near full-length human STLK4 cDNA (SEQ ID NO.3) is 3857 bp long 
and consists of a 1242 bp ORF flanked by a 2596 bp 3' UTR (1244-3839) and an 18 
nucleotide polyadenylated region. Polyadenylation signals (AATAAA) begin at positions 
2181 and 3822. Since the coding region is open throughout the 5' extent of this sequence, 
this is apparently a partial cDNA clone lacking the N-terminal start methionine. A near 
full-length murine STLK4 cDNA is represented in the 1773 bp EST AA117438. It 
extends an additional 21 nucleotides 5' of the human STLK4 consensus, but since its 
coding region is open throughout the 5' extent of the sequence, this is also probably a 
partial cDNA clone lacking the N-terminal start methionine. 
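
Locating polyadenylation signals in a cDNA sequence is a simple motif search. The following Python sketch is illustrative only and is not a tool used in the work described herein; the function name find_polya_signals is hypothetical, the demonstration string is a toy sequence rather than any sequence of the invention, and reported positions are 1-based start positions.

```python
def find_polya_signals(sequence, motif="AATAAA"):
    """Return 1-based start positions of every occurrence of the motif."""
    seq = sequence.upper()
    positions = []
    index = seq.find(motif)
    while index != -1:
        positions.append(index + 1)
        # Resume one base later so overlapping occurrences are also found.
        index = seq.find(motif, index + 1)
    return positions


if __name__ == "__main__":
    toy = "AATAAAGGGAATAAA"  # toy sequence for demonstration only
    print(find_polya_signals(toy))
```

Reporting 1-based start positions matches the convention used in the text when a signal is said to begin at a given position.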

[0486] Several EST fragments span the complete STLK4 sequence with 
AA297759 at the 5' end and AA100484 and others at the 3' end. 

cDNA Cloning and Characterization of STLK5 

[0487] The human STLK5 cDNA sequence is composed of four overlapping 
sequences, AI418298, 2R96-13-1, 3R25-45-3 and R46685. A human STLK5 clone, 
F07734, was first identified using a Smith-Waterman search of the EST database with 
SPS_sc (U33057) as a query. 

[0488] AI418298 is an IMAGE consortium cDNA clone with an 895 bp insert. 

[0489] PCR clone 2R96-13-1 was isolated from human brain sscDNA using 
primers 5'-CTCATCTGTACACACTTCATGG (SEQ ID NO:44) and 
5'-GATTCCCACACTGTAGATGTC (SEQ ID NO:45) derived from F07734. 2R96-13-1 
contains a 330 bp insert and an ORF of 330 bp (110 amino acids). 

[0490] EST clone R46685 was identified using a Smith-Waterman search of the 
EST database with the C-terminus of SPS_sc (GB:U33057) as query. Sequence analysis 
of the 1047 bp insert identified this EST to contain an ORF of 285 bp (95 amino acids) 
encoding the C-terminus of human STLK5. 

[0491] PCR clone 3R25-45-3 was isolated from human fetal brain sscDNA using 
primers 5'-GGCCCTCGACTACATCCACCACAT (SEQ ID NO:46) and 
5'-CAACGAAACTAACACAGCATAAGG (SEQ ID NO:47) derived from 2R96-13-1 and 
R46685, respectively. 3R25-45-3 contains a 330 bp insert and an ORF of 750 bp (250 
amino acids). 

[0492] The full-length human STLK5 cDNA (SEQ ID NO:96) is 2110 bp long and 
consists of a 1119 bp ORF flanked by a 229 bp 5' UTR and a 762 bp 3' UTR. The 
sequence flanking the first ATG conforms to the Kozak consensus (supra) for an initiating 
methionine, and is believed to be the translational start site for STLK5. 

[0493] Several EST fragments span the complete STLK5 sequence with 
AA297059 and F07734 at the 5' end and R46686 and F03423 and others at the 3' end. 

[0494] STLK5 displays a 100% match over a 41 bp stretch (position 2-42, SEQ ID 
NO. 97) to a human CpG island repeat (Z61277). 

cDNA Cloning and Characterization of STLK6 

[0495] Human STLK6 was first identified in the translated EST database 
(AA219667) as a novel serine threonine kinase. 

[0496] The partial human STLK6 cDNA (SEQ ID NO:98) is 2,001 bp long and 
consists of a 1,254 bp ORF flanked by a 75 bp 5' UTR and a 673 bp 3' UTR. The 
sequence flanking the first ATG conforms to the Kozak consensus (Kozak, M., Nucleic 
Acids Res. 15, 8125-8148 (1987)) for an initiating methionine, and is believed to be the 
translational start site for STLK6. 

[0497] At the time of filing, the inventors believe that STLK6 does not have any 
significant match in the nucleic acid database. 

cDNA Cloning and Characterization of STLK7 

[0498] Human STLK7 was first identified in the translated EST database 
(AA988954) as a novel serine threonine kinase. The original clone was not available 
through public sources, so a PCR fragment amplified from the sequence of AA988954 
yielded 5R54-21-2. 

[0499] The partial human STLK7 cDNA (SEQ ID NO: 100) is 311 bp long and 
consists of a 309 bp ORF. Since the coding region is open throughout the 5' and 3' extent 
of this sequence, this appears to be a partial cDNA clone lacking the N-terminal start 
methionine and C-terminal stop codon. 

[0500] STLK7 shares 80% sequence identity to human SPAK (AF099989) over a 
167 bp region and 50% nucleotide sequence identity to SLTK7 (SEQ ID NO. 101) over 
391 nucleotides. 

cDNA Cloning and Characterization of ZC1 

[0501] The human ZC1 cDNA sequence is composed of two overlapping PCR 
clones, 3R25-24-2 and R65-12-2. 

[0502] A human ZC1 clone, 125-33-5, was first identified from a PCR screen with 
degenerate oligos, TRK1 and TRK4, applied to sscDNA generated from human small 
airway epithelial cells (Clontech). Sequence analysis of the 503 bp insert identified a 501 
bp ORF (167 amino acids) with the potential to encode a novel human STK related to the 
C. elegans ZC504.4 gene product. 

[0503] PCR clone 3R25-24-2 was isolated from human SNB19 glioblastoma 
sscDNA using primers 5'-ATGGCGAACGACTCTCCCGCGAA (SEQ ID NO:48) and 
5'-ACACCAAAATCAACAAGTTTCACCTC (SEQ ID NO:49) derived from the N-terminus 
of a murine orthologue of ZC1 (NIK, GB:U88984) and the original human ZC1 clone 
125-33-5, respectively. 3R25-24-2 contains a 527 bp insert and an ORF of 519 bp (173 amino 
acids). 

[0504] PCR clone R65-12-2 was isolated as follows: A Smith-Waterman search of 
the EST database with the C. elegans ZC504.4 gene (GB:Z50029) as a query identified a 
human EST (W81656) whose ORF is related to the C. elegans gene and terminates in an 
identical residue (Trp). A primer was designed 3' to this stop codon 
(5'-AGTTACAAGGAATTCCAAGTTCT (SEQ ID NO:50)) and used in a PCR reaction with 
a primer derived from the original human ZC1 clone 125-33-5 
(5'-ATGAAGAGGAAGAAATCAAACTG (SEQ ID NO:51)) using sscDNA from human 
SNB19 glioblastoma as a template. PCR clone R65-12-2 was identified and was found to 
contain a 3611 bp insert with a 3534 bp ORF encoding the C-terminal portion of human 
ZC1 (1178 amino acids). 

[0505] The full-length human ZC1 cDNA (SEQ ID NO. 9) is 3798 bp long. Clone 
3R25-24-2 spans positions 1-527, and clone R65-12-2 spans positions 188-3798. The 
overlap between these two clones exhibits 100% sequence identity. The human ZC1 
contains a 3717 bp ORF (7-3723) flanked by a 6 bp 5' UTR and a 75 bp (3724-3798) 3' 
UTR. No polyadenylation signal (AATAAA) or polyadenylated region is present in the 
3' UTR. The sequence flanking the first ATG conforms to the Kozak consensus for an 
initiating methionine, and is believed to be the translational start site for human ZC1. 

[0506] Multiple EST fragments (W81656) match the 3' end of the human ZC1 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

cDNA Cloning and Characterization of ZC2 

[0507] The human ZC2 cDNA sequence is composed of four overlapping PCR 
clones, G75-31-17, R65-24-6, 2R28-8-1, and R99-6-10. 

[0508] A human ZC2 clone, G75-31-17, was first identified from a PCR screen 
with degenerate oligos, ROS1 (5'-GCNTTYGGNGARGTNTAYGARGG (SEQ ID 
NO:34)) and CCK4b (5'-GCTGGATCCYTCNGGNSWCATCCA (SEQ ID NO:35)), 
applied to sscDNA generated from the human HLT383 primary non-small cell lung cancer 
tissue. Sequence analysis of the 492 bp insert identified a 492 bp ORF (164 amino acids) with 
the potential to encode a novel human STK related to the C. elegans ZC504.4 gene 
product. 
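
The degenerate oligos used in this screen are written with IUPAC ambiguity codes (N, Y, R and so on). As an illustrative sketch only, and not part of the disclosed method, the following Python snippet counts how many distinct sequences such a primer represents and converts the primer into a pattern that can be matched against a concrete sequence. The function names are hypothetical, and the test sequence used below is an arbitrary example rather than a sequence from the Sequence Listing.

```python
import re
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct unambiguous sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def primer_pattern(primer: str) -> re.Pattern:
    """Compile a regular expression that matches any sequence the primer encodes."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer.upper()))


# ROS1 degenerate oligo as given in the text (SEQ ID NO:34).
ROS1 = "GCNTTYGGNGARGTNTAYGARGG"

# Arbitrary example sequence (hypothetical, chosen only to satisfy the pattern).
EXAMPLE = "GCATTTGGAGAAGTATATGAAGG"

ros1_degeneracy = degeneracy(ROS1)
ros1_matches_example = primer_pattern(ROS1).fullmatch(EXAMPLE) is not None
```

Under these definitions the ROS1 oligo contains three N, two Y and two R positions, so it stands for 4^3 x 2^2 x 2^2 = 1024 individual sequences.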

[0509] PCR clone R99-6-10 was isolated as follows: A Smith-Waterman search of 
the EST database with the C. elegans ZC504.4 gene (GB:Z50029) as a query identified two 
overlapping human EST fragments (AA115844 and R51245) whose ORFs were related to 
the C. elegans gene and terminate in an identical residue (Trp). A primer was designed 3' 
to the stop codon found in R51245 (5'-AGATGGACTGTACTGGGAGG(SEQ ID 
NO:52)) and used in a PCR reaction with a primer derived from AA115844 
(5'-ACTTTGTGCAGCTCTGTGGG(SEQ ID NO:53)) using human fetal brain sscDNA as a 
template. PCR clone R99-6-10 was identified and was found to contain a 1095 bp insert 
with a 930 bp ORF encoding the C-terminal portion of human ZC2 (310 amino acids). 

[0510] PCR clone R65-24-6 was isolated from human HT29 colon cancer cell line 
sscDNA using primers 5'-AAGGTTATGGATGTCACAGGG(SEQ ID NO:54) and 5'- 
AGATGGACTGTACTGGGAGG(SEQ ID NO:52) derived from G75-31-17 and R51245, 
respectively. The 3' primer used in this PCR reaction misprimed between positions 1634- 
1653 of this gene leading to the formation of a truncated product. R65-24-6 contains a 
1593 bp insert and an ORF of 1593 bp (531 amino acids). 

[0511] PCR clone 2R28-8-1 was isolated from human colon cancer cell line HT29 
sscDNA using primers 5'-CTCACAAGGTTGCCAACAGG(SEQ ID NO:55) and 
5'-AGTCCCCACCAGAAGGTTTAC(SEQ ID NO:56) derived from R65-24-6 and R99-6-10, 
respectively. 2R28-8-1 contains a 1538 bp insert and an ORF of 1536 bp (512 amino 
acids). 

[0512] The partial human ZC2 cDNA (SEQ ID NO. 10) is 4055 bp long. Clone 
G75-31-17 spans positions 1-492, clone R65-24-6 spans positions 58-1650, clone 2R28-8-1 
spans positions 1466-3003 and clone R99-6-10 spans positions 2961-4055. The 
overlapping regions between these clones exhibit 100% sequence identity except for a 
single guanine (G75-31-17) to adenine (R65-24-6) mismatch at position 280 resulting in 
a Glu to Lys change. Based on the presence of an acidic residue in this position in human 
ZC1 and ZC3 and C. elegans ZC504.4, the sequence encoding the Glu is probably correct. 
The human ZC2 gene contains a 3891 bp ORF (1-3891) flanked by a 164 bp (3892-4055) 3' 
UTR. No polyadenylation signal (AATAAA) or polyadenylated region is present in the 
3'UTR. 
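
The clone positions quoted for the ZC2 assembly can be checked arithmetically. The sketch below is an illustration under the assumption that all positions are 1-based and inclusive; the helper names are hypothetical and the code is not part of the disclosed method. It confirms that the four clones reach both ends of the 4055 bp sequence without gaps and reports the size of each overlap.

```python
# Clone spans as stated for the partial human ZC2 cDNA (assumed 1-based, inclusive).
ZC2_CLONES = [
    ("G75-31-17", 1, 492),
    ("R65-24-6", 58, 1650),
    ("2R28-8-1", 1466, 3003),
    ("R99-6-10", 2961, 4055),
]


def clone_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive span."""
    return end - start + 1


def check_tiling(clones, total_length):
    """Return (left, right, overlap_bp) for neighbouring clones; raise on a gap."""
    ordered = sorted(clones, key=lambda clone: clone[1])
    if ordered[0][1] != 1 or max(clone[2] for clone in ordered) != total_length:
        raise ValueError("clones do not reach both ends of the assembly")
    overlaps = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        overlap = left_end - right_start + 1
        if overlap <= 0:
            raise ValueError(f"gap between {left} and {right}")
        overlaps.append((left, right, overlap))
    return overlaps


ZC2_OVERLAPS = check_tiling(ZC2_CLONES, 4055)
ZC2_LENGTHS = {name: clone_length(start, end) for name, start, end in ZC2_CLONES}
```

The computed span lengths (492, 1593, 1538 and 1095 bp) agree with the insert sizes reported for the individual clones.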

[0513] Multiple EST fragments (R51245) match the 3' end of the human ZC2 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

cDNA Cloning and Characterization of ZC3 

[0514] The human ZC3 cDNA sequence is composed of four overlapping PCR 
clones, G75-30-30, 3R33-5-3, 3R19-17-6, and R99-43-11. 

[0515] A human ZC3 clone, G75-30-30, was first identified from a PCR screen 
with degenerate oligos, ROS1 and CCK4b, applied to sscDNA generated from a human 
HLT370 primary non-small cell lung cancer tissue. Sequence analysis of the 492 bp insert 
identified a 492 bp ORF (164 amino acids) with the potential to encode a novel human STK 
related to the C. elegans ZC504.4 gene product. 

[0516] PCR clone R99-43-11 was isolated as follows: A Smith-Waterman search 
of the EST database with the C. elegans ZC504.4 gene (GB:Z50029) as a query identified 
a human EST (R54563) whose ORF is related to the C. elegans gene and terminates in an 
identical residue (Trp). A primer was designed 3' to the stop codon found in R54563 
(5'-TCAGGGGTCAGAGGTCACG(SEQ ID NO:57)) and used in a PCR reaction with a 
primer derived from the 5' end of R54563 (5'-CCCAAACCCTACCACAAATTC(SEQ ID 
NO:58)) using sscDNA from human fetal brain as a template. PCR clone R99-43-11 was 
identified and was found to contain a 719 bp insert with a 564 bp ORF encoding the 
C-terminal portion of human ZC3 (188 amino acids). 

[0517] PCR clone 3R19-17-6 was isolated from human A549 lung cancer cell line 
sscDNA using primers 5'-CCCCCGGGAAACGATGACCA and 
5'-AGCCGCTGCCCCTCCTCTACTGT derived from G75-30-30 and R99-43-11, 
respectively. The 3' primer used in this PCR reaction misprimed, leading to the formation 
of a truncated product. 3R19-17-6 contains a 1172 bp insert and an ORF of 1170 bp (390 
amino acids). 

[0518] PCR clone 3R33-5-3 was isolated from human A549 lung cancer cell line 
sscDNA using primers 5'-ACCGCAACATCGCCACCTACTAC(SEQ ID NO:61) and 5'- 
CTCGACGTCGTGGACCACC(SEQ ID NO:62) derived from G75-30-30 and 3R19-17-6, 
respectively. 3R33-5-3 contains a 2465 bp insert and an ORF of 2463 bp (821 amino 
acids). 

[0519] The full-length human ZC3 cDNA (SEQ ID NO. 11) is 4133 bp long. 
Clone G75-30-30 spans positions 1-483, clone 3R33-5-3 spans positions 134-2598, clone 
3R19-17-6 spans positions 2356-3512 and clone R99-43-11 spans positions 3415-4133. 
The overlaps between these clones exhibit 100% sequence identity. The human ZC3 gene 
contains a 3978 bp ORF (1-3978) flanked by a 152 bp 3' UTR (3979-4133). No 
polyadenylation signal (AATAAA) or polyadenylated region is present in the 3' UTR. 
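
Statements of this kind (presence or absence of an AATAAA signal at given positions) amount to a simple motif scan of the 3' UTR. The following is a minimal sketch, assuming 1-based inclusive coordinates; the function name and the short test strings are hypothetical and are not sequences from the Sequence Listing.

```python
def find_polya_signals(seq, offset=1, motifs=("AATAAA",)):
    """Return sorted (start, end, motif) hits, 1-based inclusive.

    `offset` is the coordinate of the first base of `seq` within the
    full-length cDNA, so hits are reported in cDNA coordinates.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        pos = seq.find(motif)
        while pos != -1:
            hits.append((offset + pos, offset + pos + len(motif) - 1, motif))
            pos = seq.find(motif, pos + 1)
    return sorted(hits)


# Toy 3' UTR fragments (hypothetical, for illustration only).
toy_hits = find_polya_signals("GGCAATAAAGCTT")
variant_hits = find_polya_signals("CCAATTAAACC", motifs=("AATAAA", "AATTAAA"))
no_hits = find_polya_signals("GGCCGGCC")
```

An empty result corresponds to the "no polyadenylation signal is present" case, and passing additional motifs allows variant signals to be reported as well.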

[0520] Multiple EST fragments (R54563) match the 3' end of the human ZC3 gene, 
but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

cDNA Cloning and Characterization of ZC4 

[0521] The human ZC4 cDNA sequence, represented by PCR fragment 3R25-27-1, 
was first identified in the human genomic cosmid 82J11 (GB:Z83850) containing exon 
sequences that displayed strong homology to the ZC504.4 C. elegans gene. 

[0522] PCR clone 3R25-27-1 was isolated from human fetal liver sscDNA using 
primers 5'-CAATGTTAACCCACTCTATGTCTC(SEQ ID NO:63) and 5'- 
AGTTTGCCGATGTTTTTCTTTTC(SEQ ID NO:64) derived from a potential ORF 
(positions 25729-25852) from the 82J11 cosmid and from an EST (R98571) encoding the 
C-terminus of the human ZC4 gene, respectively. 

[0523] The partial human ZC4 cDNA (SEQ ID NO.12) is 1459 bp long and 
consists of a 1047 bp ORF (2-1048) flanked by a 411 bp (1049-1459) 3'UTR region. No 
polyadenylation signal (AATAAA) or polyadenylated region is present in the 3'UTR. 

[0524] The N-terminal coding sequence for ZC4_h was extended by building a 
contiguous DNA sequence of 233,137 bp containing Z83850 and four other sequences: 
cU84B10 and cU230B10 (from the Sanger Human Genome Sequencing Project) and 
Z97356 and Z69734 (available from the National Center for Biotechnology Information). 
The position of each sequence in the contig is represented in the table below. 



Accession     Length     Start      End 
cU84B10        43273         0     43273 
Z97356         21848     43171     65018 
Z69734         37077     63073    100149 
cU230B10       11841     88416    100256 
Z83850        132981    100156    233137 



[0525] Sequences in ZC4 genomic contig. 
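
The positions in the contig table can be used to see how much each sequence overlaps its neighbour. The sketch below is illustrative only: the table does not state whether its coordinates are 0-based or 1-based, so the overlap is computed simply as the end of one entry minus the start of the next, and the result should be read as approximate to within one base. The function and variable names are hypothetical.

```python
# (accession, length, start, end) as listed in the ZC4 genomic contig table.
ZC4_CONTIG = [
    ("cU84B10", 43273, 0, 43273),
    ("Z97356", 21848, 43171, 65018),
    ("Z69734", 37077, 63073, 100149),
    ("cU230B10", 11841, 88416, 100256),
    ("Z83850", 132981, 100156, 233137),
]


def neighbour_overlaps(entries):
    """Approximate overlap (bp) between entries that are adjacent by start position."""
    ordered = sorted(entries, key=lambda entry: entry[2])
    result = []
    for (left, _, _, left_end), (right, _, right_start, _) in zip(ordered, ordered[1:]):
        # Coordinate convention is an assumption; value may differ by one base.
        result.append((left, right, left_end - right_start))
    return result


contig_overlaps = neighbour_overlaps(ZC4_CONTIG)
contig_end = max(entry[3] for entry in ZC4_CONTIG)
```

The largest end coordinate equals the stated contig length of 233,137 bp, and every neighbouring pair overlaps, which is consistent with a single contiguous sequence.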

[0526] The 233,137 bp contig was analyzed for exons using the programs 
FGENES 1.5 and FGENESH, human gene structure prediction software available from 
The Sanger Centre. 

[0527] The resulting human ZC4 coding sequence (SEQ ID NO:104) is 3,681 bp 
long (excluding the stop codon) and encodes a STE20 kinase of 1227 amino acids. 

cDNA Cloning and Characterization of KHS2 

[0528] The human KHS2 cDNA sequence is composed of four overlapping clones, 
3R25-51-2, 3R16-34-2, 3R16-31-2, and T79916. 

[0529] A human KHS2 clone, AA250855, was first identified using a Smith-Waterman 
search of the EST database with KHS1 (GB:U77129) as a query. Sequence 
analysis of the 1112 bp insert identified a 618 bp ORF (206 amino acids) with the potential 
to encode a novel STK related to the human KHS1 gene product. Using AA250855 as a 
query, a second EST (AA446022) was found whose sequence was shown to contain the 
initiator methionine for human KHS2 based on a comparison with KHS1. 

[0530] PCR clone 3R25-51-2 was isolated from human testicular cancer sscDNA 
using primers 5'-CCGCCATGAACCCCGGCTT(SEQ ID NO:65) and 
5'-CGATTGCCAAAGACCGTGTCA(SEQ ID NO:66) derived from AA446022 and 
AA250855, respectively. 3R25-51-2 contains an 850 bp insert and an ORF of 849 bp (283 
amino acids). 

[0531] EST clone T79916 was identified using a Smith-Waterman search of the 
EST database with the C-terminus of KHS1 (GB:U77129) as a query. Sequence analysis 
of the 2107 bp insert identified this EST to contain an ORF of 345 bp (115 amino acids 
disrupted by a single stop codon) encoding the C-terminus of human KHS2, followed by 
a 1762 bp 3'UTR. 

[0532] PCR clone 3R16-34-2 was isolated from human testis sscDNA using 
primers 5'-AGAAGTTGCAGCTGTTGAGAGGA(SEQ ID NO:67) and 
5'-TATGGCCCGTGTAAGGATTTC(SEQ ID NO:68) derived from AA250885 and T79916, 
respectively. 3R16-34-2 contains a 1516 bp insert and an ORF of 1128 bp (376 amino 
acids). 

[0533] PCR clone 3R16-31-2 was isolated from normal human colon sscDNA 
using primers 5'- GTGCCAGAAGTGTTGTGTTGTAA(SEQ ID NO:69) and 5'- 
TATGGCCCGTGTAAGGATTTC(SEQ ID NO:68) derived from EST T79916. 3R16-31- 
2 contains a 728 bp insert and an ORF of 669 bp (223 amino acids). This clone lacked the 
stop codon present within EST T79916 (position 2662 in the KHS2 sequence). 

[0534] The full-length human KHS2 cDNA (SEQ ID NO. 17) is 4023 bp long. 
Clone 3R25-51-2 spans positions 1-855, clone AA250885 spans positions 336-923, clone 
3R16-34-2 spans positions 545-2061, and clone T79916 spans positions 1917-4023. The 
overlapping regions between these clones exhibit 100% sequence identity, except for 4 
nucleotide differences: two are silent, a third corrects the internal stop codon at 
position 2662, and the fourth, at position 247 (T to C change), results in a Pro to Leu 
change. The human KHS2 cDNA contains a 2682 bp ORF (6-2687) flanked by a 5 bp 
(1-5) 5'UTR and a 1336 bp (2688-4023) 3' UTR. A potential polyadenylation signal 
(AATAAA) is found at positions 4008-4013. No polyadenylated region is present in the 
3' UTR. The sequence flanking the first ATG is in a poor context for translational 
initiation; however, a 134 bp 5'UTR sequence from EST AA446022 did not reveal any 
additional ATGs and displayed two in-frame stop codons 5' to the putative start ATG for 
human KHS2. 

[0535] Multiple EST fragments match the 5' end (AA446022) as well as the 3' end 
(R37625) of the human KHS2 gene. 

cDNA Cloning and Characterization of SULU1 

[0536] The human SULU1 cDNA sequence is composed of three overlapping 
clones, N40091, 2R90-1-1 and R90907. 

[0537] A human SULU1 clone, N40091, was first identified using a Smith- 
Waterman search of the EST database with the C. elegans SULU gene (GB: U32275) as a 
query. Sequence analysis of the 1321 bp insert identified a 906 bp ORF (302 amino acids) 
with the potential to encode a novel human STK related to the C. elegans SULU gene 
product. 

[0538] EST clone R90907 was first identified using a Smith-Waterman search of 
the EST database with the 3' end of the C. elegans SULU gene (GB: U32275) as a query. 
Sequence analysis of the 1647 bp insert identified a 578 bp ORF (192 amino acids) with 
the potential to encode the C-terminus of the human SULU1 gene product. 

[0539] PCR clone 2R90-1-1 was isolated from human HT29 colon cancer cell 
sscDNA using primers 5'-TATTGAATTGGCGGAACGGAAG(SEQ ID NO:70) and 
5'-TTGTTTTGTGCTCATTCTTTGGAG(SEQ ID NO:71) derived from N40091 and 
R90907, respectively. 2R90-1-1 contains a 1625 bp insert and an ORF of 1623 bp (541 
amino acids). 

[0540] The full-length human SULU1 cDNA (SEQ ID NO.19) is 4177 bp long. 
Clone N40091 spans positions 1-1321, clone 2R90-1-1 spans positions 1048-2671, and 
clone R90907 spans positions 2531-4177. The overlapping regions between these clones 
exhibit 100% sequence identity. The human SULU1 cDNA contains a 2694 bp ORF 
(415-3108) flanked by a 414 bp (1-414) 5'UTR and a 1069 bp (3109-4177) 3' UTR followed by 
a 19 nucleotide polyadenylated region. A potential polyadenylation signal (AATAAA) is 
found at positions 4164-4169. The sequence flanking the first ATG conforms to the 
Kozak consensus for an initiating methionine, and is believed to be the translational start 
site for human SULU1. 
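
The UTR and ORF coordinates given for a cDNA such as SULU1 should partition the sequence exactly. As a small consistency sketch (assuming 1-based inclusive coordinates; the class and function names are hypothetical and not part of the disclosure), the following Python code checks that the three regions tile the 4177 bp sequence and that the ORF length is a multiple of three.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def tiles_exactly(regions, total_length):
    """True if the regions cover 1..total_length with no gaps and no overlaps."""
    expected_start = 1
    for region in sorted(regions, key=lambda r: r.start):
        if region.start != expected_start:
            return False
        expected_start = region.end + 1
    return expected_start == total_length + 1


# Coordinates as stated for the human SULU1 cDNA.
SULU1 = [
    Region("5'UTR", 1, 414),
    Region("ORF", 415, 3108),
    Region("3'UTR", 3109, 4177),
]

sulu1_lengths = {region.name: region.length for region in SULU1}
sulu1_consistent = tiles_exactly(SULU1, 4177)
```

The same check can be applied to the coordinates quoted for the other cDNAs to flag a mistyped position.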

[0541] Multiple EST fragments match the 5'end (N27153) as well as the 3' end 
(R90908) of the human SULU1 gene. 

cDNA Cloning and Characterization of Murine SULU3 

[0542] The murine SULU3 cDNA sequence is represented by PCR fragment 
2R92-1-6. 

[0543] A murine SULU3 clone, G83-4-5, was first identified from a PCR screen 
with degenerate oligos, CCK4c and CCK4b, applied to sscDNA generated from murine 
day-12 embryos. Sequence analysis of the 473 bp insert identified a 471 bp ORF (157 amino 
acids) with the potential to encode a novel human STK related to the C. elegans SULU 
gene (GB: U32275) product. The antisense strand of G83-4-5 is identical at the nucleic 
acid level to the 5'UTR of the murine ets1 protooncogenic transcription factor 
(GB:X53953). This homology is likely the result of a cloning artifact attached to the 
5'-end of the database entry for murine ets1. 

[0544] PCR clone 3R19-17-6 was isolated from human A549 cell sscDNA using 
primers 5'-CCCCCGGGAAACGATGACCA(SEQ ID NO:59) and 
5'-AGCCGCTGCCCCTCCTCTACTGT(SEQ ID NO:60) derived from G75-30-30 and 
R99-43-11, respectively. The 3' primer used in this PCR reaction misprimed, leading to the 
formation of a truncated product. 3R19-17-6 contains a 1172 bp insert and an ORF of 
1170 bp (390 amino acids). 

[0545] PCR clone 2R92-1-6 was isolated from murine d8 embryo sscDNA using 
primers 5'-ACCGCAACATCGCCACCTACTAC(SEQ ID NO:61) and 
5'-GATTGCTTTGTGCTCATTCTTTGG(SEQ ID NO:72) derived from the 5' UTR of the 
ets1 gene and the human EST AA234623, respectively. The latter (shown herein) encodes 
the C-terminus of human SULU3. 2R92-1-6 contains a 2249 bp insert and an ORF of 
2244 bp (748 amino acids). 

[0546] The partial murine SULU3 cDNA (SEQ ID NO.21) is 2249 bp long and 
consists of a 2244 bp ORF (6-2249) flanked by a 5 bp (1-5) 5'UTR. The sequence 
flanking the first ATG conforms to the Kozak consensus for an initiating methionine, and 
is believed to be the translational start site for murine SULU3. 

[0547] One EST fragment (AA446022) matches the 3' end of the partial murine 
SULU3 gene, but at the time of filing, the inventors believe that none exist in GenBank or 
the EST database that match its 5' end. 

cDNA Cloning and Characterization of Human SULU3 

[0548] The human SULU3 cDNA sequence is composed of two overlapping 
clones, 2R90-22-1 and AA234623. 

[0549] A human SULU3 clone, AA234623, was first identified using a Smith-Waterman 
search of the EST database with the C. elegans SULU gene (GB: U32275) as a 
query. Sequence analysis of the 2652 bp insert identified a 1185 bp ORF (395 amino 
acids) with the potential to encode the C-terminus of a novel human STK related to the 
C. elegans SULU gene product. 

[0550] PCR clone 2R90-22-1 was isolated from human SKMell28 melanoma cell 
line sscDNA using primers 5'-TATTGAATTGGCGGAACGGAAG(SEQ ID NO:70) and 
5'-TTGTTCTAAGAGTGCCCTCCG(SEQ ID NO:73) derived from the murine SULU3 
2R92-1-6 clone and from AA234623, respectively. 2R90-22-1 contains a 1897 bp insert and 
an ORF of 1896 bp (632 amino acids). 

[0551] The partial human SULU3 cDNA (SEQ ID NO.20) is 3824 bp long. Clone 
2R90-22-1 spans positions 1-1897 and clone AA234623 spans positions 1173. The 
overlapping region between these clones exhibits 100% sequence identity. The human 
SULU3 cDNA contains a 2358 bp ORF (2-2359) flanked by a 1465 bp (2360-3824) 
3'UTR followed by a 19 nucleotide polyadenylated region. A potential polyadenylation 
signal (AATAAA) is found at positions 2602-2607. Since the coding region is open 
throughout the 5' extent of this sequence, this is apparently a partial cDNA clone lacking 
the N-terminal start methionine. 

[0552] Multiple EST fragments (R02283) match the 3'end of the human SULU3 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

cDNA Cloning and Characterization of GEK2 

[0553] The human GEK2 cDNA sequence is composed of three overlapping clones, 
AA459448, 3R25-48-1 and GEK2_h#3. 

[0554] A human GEK2 clone, AA459448, was first identified using a Smith-Waterman 
search of the EST database with the human SLK gene (GB: AB002804) as a 
query. Sequence analysis of the 1286 bp insert identified a 1227 bp ORF (409 amino 
acids) with the potential to encode the N-terminus of a novel human STK related to the 
human SLK gene product. An additional Smith-Waterman search using the C-terminus of 
the SLK gene as a query yielded three additional ESTs, AA323687, AA380492 and 
AA168869, that encode the C-terminal region of human GEK2. 

[0555] PCR clone 2R98-41-17 was isolated from human testis sscDNA using 
primers 5'-AAGACCATGCCGTGCGCCG(SEQ ID NO:74) and 
5'-ATTCCTTCAGGTTCTGGTTATGG(SEQ ID NO:75) derived from AA323687 and from 
AA380492, respectively. 2R98-41-17 contains an 851 bp insert and an ORF of 849 bp (283 
amino acids). 

[0556] PCR clone GEK2_h#3 was isolated from human sscDNA made from the H23 
tumor cell line using primers 5'-GCAGCAAGTGGAGAAGATGG(SEQ ID NO:109) and 
5'-GGAAGCATCCCCAGAGCTGTAG(SEQ ID NO:110) derived from the sequence of 
clone 3R25-48-1 and from the 3' end of murine LOK (GB:D89728), respectively. 
GEK2_h#3 contains a 1042 bp insert and an ORF of 1041 bp (347 amino acids). 

[0557] The full-length human GEK2 cDNA (SEQ ID NO:106) is 2962 bp long. Clone 
AA459448 spans positions 1-1286, clone 3R25-48-1 spans positions 1100-2449 and clone 
GEK2_h#3 spans positions 1920-2962. The overlapping regions between these clones 
exhibit 100% sequence identity. 

[0558] The human GEK2 cDNA contains a 2904 bp ORF (59-2962) flanked by a 58 bp 
(1-58) 5'UTR. The sequence flanking the first ATG conforms to the Kozak consensus for 
an initiating methionine, and is believed to be the translational start site for human GEK2. 

[0559] Multiple EST fragments (AA465671) match the 5' end of the sequence, but 
only one (AA380492) matches the 3' end of the human GEK2 gene. 

cDNA Cloning and Characterization of PAK4 

[0560] The human PAK4 cDNA sequence is represented by clone SNB2#1. 

[0561] A human PAK4 clone, R88460, was first identified using a Smith- 
Waterman search of the EST database with the human PAK gene (GB: U24152) as a 
query. Sequence analysis of the 2332 bp insert identified a 930 bp ORF (310 amino acids) 
with the potential to encode the C-terminus of a novel human STK related to the human 
PAK gene product. 

[0562] cDNA clone SNB2#1 was isolated from human glioblastoma cell line 
SNB75 cDNA library using a probe derived from R88460. SNB2#1 contains a 3604 bp 
insert and an ORF of 2043 bp (681 amino acids). 

[0563] The full-length human PAK4 cDNA (SEQ ID NO.27) is 3604 bp long and 
consists of a 2043 bp ORF (143-2185) flanked by a 142 bp (1-142) 5'UTR and a 1419 bp 3' 
UTR followed by a 22 nucleotide polyadenylated region. A potential polyadenylation 
signal (AATTAAA) is found at positions 3582-3588. The sequence flanking the first 
ATG conforms to the Kozak consensus for an initiating methionine, and is believed to be 
the translational start site for human PAK4. The 3' UTR of the PAK4 gene contains a GT 
dinucleotide repeat prone to undergo expansion based on the number of repeats found in 
clones SNB2#1 and R88460, 32 and 23, respectively. Several neurologic disorders have 
been correlated with the expansion of di- or tri-nucleotide repeats similar to those found in 
the PAK4 sequence, suggesting PAK4 may also be a disease target and that this repeat in 
its 3' UTR may serve as a diagnostic marker. 
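
Counting the units in a dinucleotide repeat, as done above for the two PAK4 clones, can be sketched with a regular expression. The snippet below is an illustration only: the function name is hypothetical, and the test strings are synthetic repeats built to contain 23 and 32 GT units, not the actual clone sequences.

```python
import re


def longest_repeat_run(seq: str, unit: str = "GT") -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit.upper())})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Synthetic examples (hypothetical flanking bases, repeat counts taken from the text).
short_allele = "AC" + "GT" * 23 + "CA"
long_allele = "AC" + "GT" * 32 + "CA"

short_count = longest_repeat_run(short_allele)
long_count = longest_repeat_run(long_allele)
expansion = long_count - short_count
```

Applied to sequenced alleles, a difference in the reported unit count between samples would indicate a length polymorphism at the repeat.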

[0564] Multiple EST fragments (AA535791) match the 3' end of the human PAK4 
gene, but at the time of filing, the inventors believe that none exist in GenBank or the EST 
database that match its 5' end. 

cDNA Cloning and Characterization of PAK5 

[0565] The full-length human PAK5 cDNA sequence is composed of two 
overlapping clones, H460#1-1 and SNB8#5. 

[0566] A human PAK5 clone, R18825, was first identified using a Smith- 
Waterman search of the EST database with the human PAK4 gene as a query. Sequence 
analysis of the 1248 bp insert identified a 420 bp ORF (140 amino acids) with the potential 
to encode the C-terminus of a novel human STK related to the human PAK4 gene product. 

[0567] cDNA clone SNB8#5 was isolated from a human SNB75 cDNA library using 
a probe derived from R18825. SNB8#5 contains a 2028 bp insert and an ORF of 1194 bp 
(398 amino acids). 

[0568] The partial human PAK5 cDNA (SEQ ID NO.28) is 2028 bp long and 
consists of a 1194 bp ORF (2-1195) flanked by an 833 bp (1196-2028) 3' UTR followed by 
a 22 nucleotide polyadenylated region. A potential polyadenylation signal (AATTAAA) is 
found at positions 2004-2010. Since the coding region is open throughout the 5' extent of 
this sequence, this is apparently a partial cDNA clone lacking the N-terminal start 
methionine. 

[0569] Clone H460#1-1 was isolated from a human lung H460 cDNA library using 
a probe derived from the partial SNB8#5 cDNA clone described above. Sequence analysis 
of the 2526 bp insert identified a 1773 bp ORF (592 amino acids) with the potential to 
encode a full-length PAK5. 

[0570] The human PAK5 cDNA (SEQ ID NO:102) is 2,806 bp long and consists 
of a 1,773 bp ORF flanked by a 201 bp 5' UTR and an 833 bp 3' UTR. The sequence 
flanking the first ATG conforms to the Kozak consensus (Kozak, M., Nucleic Acids Res. 
15, 8125-8148 (1987)) for an initiating methionine, and is believed to be the translational 
start site for PAK5. 
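
Whether the sequence around an ATG resembles the Kozak consensus is often judged by two positions: a purine (A or G) at -3 and a G at +4, counting the A of the ATG as +1. The following sketch applies that commonly used simplification; it is an assumption made for illustration and not necessarily the criterion applied by the inventors. The function name, the three category labels and the test sequences are all hypothetical.

```python
def kozak_context(seq: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at 0-based `atg_index`.

    Simplified rule (an assumption for illustration): 'strong' if there is a
    purine at -3 and a G at +4, 'adequate' if only one holds, else 'weak'.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Toy contexts (hypothetical sequences; the ATG begins at index 6 in each).
strong = kozak_context("GCCACCATGG", 6)
adequate = kozak_context("GCCGCCATGA", 6)
weak = kozak_context("TTTTTTATGC", 6)
```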

[0571] PAK5 shares 99% sequence identity over 2795 bp to a recent database 
entry, AF005046. These sequences are presumed to be from the same gene, with minor 
polymorphic variations. 

EXAMPLE 2: Expression Analysis of Mammalian STE20-related Protein Kinases 
Materials and Methods 
Northern blot analysis 

[0572] Northern blots were prepared by running 10 µg total RNA isolated from 60 
human tumor cell lines (HOP-92, EKVX, NCI-H23, NCI-H226, NCI-H322M, NCI-H460, 
NCI-H522, A549, HOP-62, OVCAR-3, OVCAR-4, OVCAR-5, OVCAR-8, IGROV1, 
SK-OV-3, SNB-19, SNB-75, U251, SF-268, SF-295, SF-539, CCRF-CEM, K-562, MOLT-4, 
HL-60, RPMI 8226, SR, DU-145, PC-3, HT-29, HCC-2998, HCT-116, SW620, Colo 205, 
HCT-15, KM-12, UO-31, SN12C, A498, Caki-1, RXF-393, ACHN, 786-0, TK-10, LOX 
IMVI, Malme-3M, SK-MEL-2, SK-MEL-5, SK-MEL-28, UACC-62, UACC-257, M14, 
MCF-7, MCF-7/ADR RES, Hs578T, MDA-MB-231, MDA-MB-435, MDA-N, BT-549, 
T47D), from 22 human adult tissues (thymus, lung, duodenum, colon, testis, brain, 
cerebellum, cortex, salivary gland, liver, pancreas, kidney, spleen, stomach, uterus, 
prostate, skeletal muscle, placenta, mammary gland, bladder, lymph node, adipose tissue), 
and 2 human fetal normal tissues (fetal liver, fetal brain), on a denaturing formaldehyde 
1.2% agarose gel and transferring to nylon membranes. 

[0573] Filters were hybridized with random primed [α-32P]dCTP-labeled probes 
synthesized from the inserts of several of the STE20-related kinase genes. Hybridization 
was performed at 42 °C overnight in 6X SSC, 0.1% SDS, 1X Denhardt's solution, 100 
µg/mL denatured herring sperm DNA with 1-2 x 10^6 cpm/mL of 32P-labeled DNA probes. 
The filters were washed in 0.1X SSC/0.1% SDS, 65 °C, and exposed on a Molecular 
Dynamics phosphorimager. 

Quantitative PCR analysis 

[0574] RNA was isolated from a variety of normal human tissues and cell lines. 
Single stranded cDNA was synthesized from 10 µg of each RNA as described above using 
the Superscript Preamplification System (GibcoBRL). These single strand templates were 
then used in a 25 cycle PCR reaction with primers specific to each clone. Reaction 
products were electrophoresed on 2% agarose gels, stained with ethidium bromide and 
photographed on a UV light box. The relative intensity of the STK-specific bands was 
estimated for each sample. 

DNA Array Based Expression Analysis 

[0575] Plasmid DNA array blots were prepared by loading 0.5 µg denatured 
plasmid for each STE20-related kinase on a nylon membrane. The [α-32P]dCTP labeled 
single stranded DNA probes were synthesized from the total RNA isolated from several 
human immune tissue sources or tumor cells (thymus, dendrocytes, mast cells, monocytes, 
B cells (primary, Jurkat, RPMI8226, SR), T cells (CD8/CD4+, TH1, TH2, CEM, 
MOLT4), K562 (megakaryocytes)). Hybridization was performed at 42 °C for 16 hours in 
6X SSC, 0.1% SDS, 1X Denhardt's solution, 100 µg/mL denatured herring sperm DNA 
with 10^6 cpm/mL of [α-32P]dCTP labeled single stranded probe. The filters were washed in 
0.1X SSC/0.1% SDS, 65 °C, and exposed for quantitative analysis on a Molecular 
Dynamics phosphorimager. 

RESULTS 

Distribution of STE20-Related Gene Transcripts in Normal Tissues and Tumor Cell Lines 

[0576] ZC1, ZC2, and ZC3 RNA expression was analyzed by quantitative PCR 
in multiple human normal tissues, cultured primary epithelial and endothelial cells, and 
tumor cell lines. The results are summarized in Tables 1 and 2, with relative expression 
values ranging from 0 (undetectable) to 23 (very strong). An "x" denotes a sample that was 
not tested. ZC1, ZC2, and ZC3 were all expressed at very low levels in most normal human 
tissues; however, ZC1 and ZC2 were more abundant in cultured epithelial cells and ZC3 in 
normal kidney and breast tissue. 

[0577] Expression of these 3 genes was also examined in a panel of human tumor 
cell lines representing a diverse sampling of tumor types (Table 2). ZC1 and ZC2 showed 
strong expression in cell lines from most melanomas and renal tumors and from some 
non-small cell lung cancers and colon tumors. ZC3 expression was consistently lower in the 
tumor cell lines except for high expression in most breast cancers and leukemias. The 
robust overexpression of ZC1, ZC2, and ZC3 in tumor cells versus normal tissues may 
provide an attractive target for oncology drug development. 

[0578] Expression of all the novel STE20-related kinases was examined in a panel 
of human immune tissues/cells by hybridization to a DNA array blot containing plasmids 
encoding each of these genes. STLK2 was broadly expressed in all 14 immune samples, 
whereas STLK4 and PAK4 were highly expressed in a subset of 6-7 of the samples (Table 
3). Several other kinases (SULU3, ZC4, KHS2) had more restricted expression, while 
others were expressed in only a single immune source (STLK3, thymus; ZC1, dendrocytes; 
ZC3, monocytes; PAK5, mast cells and MOLT4), and several more were absent from all 
the immune sources assayed (GEK2, SULU1, ZC2, STLK5). These expression patterns 
were quite distinct among members of the same subfamily (i.e., ZC1, ZC2, ZC3 and ZC4, 
or PAK1, PAK2, PAK3, PAK4, PAK5). This analysis suggests that some of these kinases 
may be candidate targets for various immune disorders, and that some, which are more 
broadly expressed, may mediate functions vital to the basic biology of most proliferating 
cells. 
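
The sample counts quoted above can be reproduced from the Table 3 values. The sketch below makes two assumptions that the text does not state explicitly: that a value of 350 is the background floor of the array readout (no detectable signal), and that "XXX" marks a sample that was not tested. The variable and function names are hypothetical. Each list holds the 14 immune samples in table order (thymus, dendrocytes, mast cells, monocytes, B cells, CD8+ CD4+, TH1, TH2, CEM, MOLT4, JURKAT, RPMI8226, SR, K562).

```python
BACKGROUND = 350   # assumed background floor of the phosphorimager readout
NOT_TESTED = None  # stands in for 'XXX' entries

TABLE3_ROWS = {
    "STLK2": [117770, 13771, 27620, 92036, 18305, 39109, 5408, 3564,
              47236, 53262, 47605, 22560, 65936, 30390],
    "STLK4": [8524, 350, 350, 350, 350, 8685, 5642, 350,
              3648, 350, 26772, 1570, 350, 350],
    "PAK4": [32658, 7684, 3729, 100948, 350, 350, 350, 1604,
             350, 10246, 350, 3229, 350, 350],
    "GEK2": [350] * 14,
    "ZC4": [NOT_TESTED, NOT_TESTED, NOT_TESTED, NOT_TESTED, 350, 350, 350, NOT_TESTED,
            1094, 7813, 14945, NOT_TESTED, 350, 6385],
}


def samples_above_background(values, background=BACKGROUND):
    """Count tested samples whose signal exceeds the assumed background."""
    return sum(1 for value in values if value is not NOT_TESTED and value > background)


expressed_counts = {kinase: samples_above_background(values)
                    for kinase, values in TABLE3_ROWS.items()}
```

Under these assumptions STLK2 is above background in all 14 samples, STLK4 in 6, PAK4 in 7 and GEK2 in none, in line with the description above.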

TABLE 1 

ZC1, ZC2 and ZC3 Expression in Normal Human Tissues and Cells 

Sample              ZC1    ZC2    ZC3 
NORMAL 
Brain      Tiss     2.8    0.6    0.9 
Duod       Tiss     3.8    1.5    0.3 
Heart      Tiss     1.2    0.3    0.0 
Kidney     Tiss     0.7    0.0    7.0 
Lung       Tiss     1.6    0.2    0.0 
Pancreas   Tiss     2.0    0.4    2.5 
Placenta   Tiss     1.4    0.0    0.0 
Sal gl.    Tiss     3.0    0.3    3.2 
Sk mus.    Tiss     2.3    0.1    0.1 
Spleen     Tiss     0.4    0.0    X 
Stomach    Tiss     0.8    0.0    0.0 
Thymus     Tiss     3.5    0.4    1.5 
Cereb      Tiss     2.8    1.1    4.4 
Liver      Tiss     1.8    0.0    0.4 
Uterus     Tiss     1.6    0.0    1.4 
Prostate   Tiss     1.4    0.0    1.6 
Testis     Tiss     X      X      5.8 
f Brain    Tiss     X      X      3.1 
Mam gl     Tiss     X      X      7.2 
HCAEC      ENDO     1.0    0.0    0.0 
HMVEC-d    ENDO     0.7    0.0    0.4 
HMVEC-L    ENDO     2.2    1.6    1.8 
HPAEC      ENDO     9.3    5.3    6.4 
HMEC       EPI      4.1    2.3    1.9 
RPTEC      EPI      3.6    2.2    0.2 
HRCE       EPI      5.3    3.5    1.3 
HSAE       EPI      0.9    3.3    4.8 



TABLE 2 



ZC1, ZC2 and ZC3 Expression in Tumor 



Sample 


Origin 


ZC1 


ZC2 


ZC3 


HOP-92 


Lung 


9.3 


7.2 


3.3 


EKVX 


Lung 


10.7 


3.7 


3.5 


NCI-H23 


Lung 


5.8 


6.3 


4.1 


NCI-H226 


Lung 


6.5 


6.8 


3.3 


NCI-H322M 


Lung 


3.5 


5.8 


4.9 


NCI-H460 


Lung 


4.5 


3.7 


2.9 


NCI-H522 


Lung 


4.7 


3.3 


4.6 


A549/ATCC 


Lung 


3.8 


3.6 


4.1 
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lLnes 



Sample 


Origin 


ZC1 


ZC2 


ZC3 


HCC-2998 


Colon 


2.4 


3.8 


3.0 


HCT 116 


Colon 


2.2 


2.1 


5.4 


SW-620 


Colon 


7.8 


12.1 


3.1 


COLO 205 


Colon 


9.1 


16.2 


3.0 


HCT-15 


Colon 


13.8 


4.9 


2.5 


KM-12 


Colon 


7.0 


13.2 


3.1 


UO-31 


Colon 


10.4 


10.6 


0.9 


SN12C 


Renal 


8.1 


3.4 


2.8 
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HOP-62 


Lung 


4.3 


3.8 


4.2 


OVCAR-3 


Ovary 


2.9 


3.1 


1.5 


OVCAR-4 


Ovary 


3.3 


1.0 


3.8 


OVCAR-5 


Ovary 


2.6 


3.6 


2.2 


OVCAR-8 


Ovary 


3.6 


2.0 


4.7 


IGROV1 


Ovary 


3.8 


1.7 


3.2 


SK-OV-3 


Ovary 


4.9 


0.0 


3.5 


SNB-19 


CNS 


5.1 


5.4 


4.2 


SNB-75 


CNS 


2.5 


0.9 


0.7 


U251 


CNS 


1.5 


1.2 


0.6 


SF-268 


CNS 


5.8 


2.7 


3.0 


SF-295 


CNS 


6.4 


1.1 


3.2 


SF-539 


CNS 


5.1 


2.9 


4.3 


CCRF-CEM 


Leuk 


3.4 


2.7 


3.1 


K-562 


Leuk 


4.1 


6.3 


4.3 


MOLT-4 


Leuk 


7.1 


3.4 


4.2 


HL-60 


Leuk 


X 


X 


0.4 


RPMI 8226 


Leuk 


0.5 


0.2 


1.4 


SR 


Leuk 


3.5 


7.2 


5.4 


DU-145 


Pro 


X 


X 


3.4 


PC-3 


Pro 


X 


X 


3.4 


HT-29 


Colon 


2.4 


5.9 


6.6 



15275.1 



A498 


Renal 


6.2 


3.1 


2.9 


Caki-1 


Renal 


9.2 


14.4 


2.3 


RXF393 


Renal 


10.6 


4.8 


2.8 


ACHN 


Renal 


9.3 


6.0 


3.9 


786-0 


Renal 


8.8 


15.6 


5.6 


TK-10 


Renal 


20.9 


21.2 


5.0 


LOX IMVI 


Mel 


2.3 


2.4 


3.3 


Malme-3M 


Mel 


X 


X 


2.2 


SK-MEL-2 


Mel 


15.7 


14.1 


2.9 


SK-MEL-5 


Mel 


7.9 


7.0 


0.0 


SK-MEL-28 


Mel 


16.5 


23.1 


0.0 


UACC-62 


Mel 


12.1 


18.3 


5.3 


UACC-257 


Mel 


10.8 


9.4 


6.2 


M14 


Mel 


4.4 


0.9 


7.9 


MCF7 


Breast 


4.8 


1.3 


7.7 


MCF-7/ADR 


Breast 


8.8 


3.4 


7.7 


Hs 578T 


Breast 


6.9 


2.6 


5.7 


MDA-MB-231 


Breast 


5.7 


1.9 


6.4 


MDA-MB-435 


Breast 


4.8 


6.7 


9.1 


MDA-N 


Breast 


7.3 


6.3 


9.1 


BT-549 


Breast 


3.6 


1.9 


8.0 


T-47D 


Breast 


0.4 


12.3 


9.3 
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Table 3: STE20-related kinase expression in a human immune panel 



KINASE   thymus   Dendrocytes   Mast cells   Monocytes   B cells   CD8+ CD4+   TH1    TH2
GEK2     350      350           350          350         350       350         350    350
SULU1    350      350           350          350         350       350         350    350
SULU3    350      350           350          350         12149     350         5115   350
STLK2    117770   13771         27620        92036       18305     39109       5408   3564
STLK3    8624     350           350          350         350       350         350    350
STLK4    8524     350           350          350         350       8685        5642   350
STLK5    XXX      XXX           XXX          XXX         350       350         350    XXX
ZC1      350      3377          350          350         350       350         350    350
ZC2      350      350           350          350         350       350         350    350
ZC3      350      350           350          20156       350       350         350    350
ZC4      XXX      XXX           XXX          XXX         350       350         350    XXX
KHS2     8766     2508          350          56575       350       350         350    350
PAK4     32658    7684          3729         100948      350       350         350    1604
PAK5     350      350           4905         350         350       350         350    350



KINASE   CEM        MOLT4      JURKAT     RPMI8226   SR         K562
         (T cell)   (T cell)   (B cell)   (B cell)   (B cell)   (MO)
GEK2     350        350        350        350        350        350
SULU1    350        350        350        350        350        350
SULU3    350        350        350        350        350        350
STLK2    47236      53262      47605      22560      65936      30390
STLK3    350        350        350        350        350        350
STLK4    3648       350        26772      1570       350        350
STLK5    350        350        350        XXX        350        350
ZC1      350        350        350        350        350        350
ZC2      350        350        350        350        350        350
ZC3      350        350        350        350        350        350
ZC4      1094       7813       14945      XXX        350        6385
KHS2     350        350        350        350        350        350
PAK4     350        10246      350        3229       350        350
PAK5     350        12672      350        350        350        350



Transcript size from Northern data 

Kinase   (kb)
STLK2    3.8
STLK4    5.0
ZC1      6.9/4.7
ZC2      6.0/8.0
ZC4      5
KHS2     4.4
SULU1    4.5
SULU3    10.0
GEK2     5.5
PAK4     4.8
PAK5     3.5



[0579] STLK2 is widely expressed; the highest expression levels were found in 
placenta, spleen and PBL. 

[0580] STLK4 is also widely expressed in normal tissues including heart, brain, 
placenta, lung, liver, skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testis, 
ovary, small intestine, colon, and peripheral blood lymphocytes. STLK4 was also detected 
in Jurkat T cells. 

[0581] ZC1 is highly overexpressed in the following human cancer cell lines: 
HOP-92, EKVX, NCI-H23, NCI-H226, NCI-H322M, NCI-H522, A549, HOP-62 (lung); 
OVCAR-3, OVCAR-4, OVCAR-5 (ovary); SNB-19, U251, SF-268, SF-295, SF-539 
(CNS); K-562, RPMI-8226 (leukemia); DU-145, PC-3 (prostate); HT-29, HCC-2998, 
HCT-116, SW620, COLO-205, HCT-15, KM-12 (colon); UO-31, Caki-1, RXF-393, 
786-0, TK-10 (renal); LOX IMVI, Malme-3M, SK-MEL-2, SK-MEL-28, UACC-62, 
UACC-257, M14 (melanoma); and MCF-7, MCF-7/ADR, Hs 578T, MDA-MB-231, 
MDA-MB-435, MDA-N, BT-549, T-47D (breast). 

[0582] ZC2 is expressed in brain and testis. It is highly overexpressed in the 
following human cancer cell lines: TK-10 (renal); SK-MEL-28, UACC-62 (melanoma); 
T-47D (breast). 

[0583] ZC2 showed moderate expression in HOP-92 (lung); OVCAR-4, IGROV1 (ovary); 
SNB-75, U251 (brain); K-562 (leukemia); and COLO-205 (colon). 

[0584] SULU1 is overexpressed in the following human cancer cell lines: HOP-92, 
EKVX, NCI-H23, NCI-H226, NCI-H322M, NCI-H522, A549, HOP-62 (lung); OVCAR-3, 
OVCAR-4, OVCAR-5, SK-OV-3 (ovary); SNB-19, U251, SF-268, SF-295, SF-539 
(CNS); K-562, RPMI-8226 (leukemia); DU-145, PC-3 (prostate); HT-29, HCC-2998, 
HCT-116, SW620, COLO-205, HCT-15, KM-12 (colon); UO-31, Caki-1, RXF-393, 
786-0, TK-10 (renal); LOX IMVI, Malme-3M, SK-MEL-2, SK-MEL-28, UACC-62, 
UACC-257, M14 (melanoma); MCF-7, MCF-7/ADR, Hs 578T, MDA-MB-231, 
MDA-MB-435, MDA-N, BT-549, T-47D (breast). 

[0585] SULU3 showed a broad pattern of expression in the normal tissue panel of 
RNAs. 

[0586] GEK2 was expressed in spleen, thymus and testis. Expression was high in 
the cell lines RBL-2H3 and H441. 

[0587] PAK4 was expressed in the normal tissues brain, testis and prostate, and in 
the human cancer cell lines NCI-H23 (lung); OVCAR-3 (ovary); SNB-19, U251 (CNS); 
RPMI-8226 (leukemia); DU-145 (prostate); COLO-205, HCT-15 (colon). 

[0588] PAK5 showed weak expression levels in the normal tissues: brain, testes, 
bladder, colon, adrenal medulla, spleen, fetal liver, breast, cerebral cortex, cerebellum, 
thymus, salivary gland, lung, stomach, duodenum, uterus, prostate, skeletal muscle and 
placenta. PAK5 was overexpressed in the human cancer cell lines: HOP-92, EKVX, 
NCI-H23, NCI-H226, NCI-H322M, NCI-H522, A549, HOP-62 (lung); OVCAR-3, OVCAR-4, 
OVCAR-5, SK-OV-3 (ovary); SNB-19, U251, SF-268, SF-295, SF-539 (CNS); K-562, 
RPMI-8226 (leukemia); DU-145, PC-3 (prostate); HT-29, HCC-2998, HCT-116, SW620, 
COLO-205, HCT-15, KM-12 (colon); UO-31, Caki-1, RXF-393, 786-0, TK-10 (renal); 
LOX IMVI, Malme-3M, SK-MEL-2, SK-MEL-28, UACC-62, UACC-257, M14 
(melanoma); MCF-7, MCF-7/ADR, Hs 578T, MDA-MB-231, MDA-MB-435, MDA-N, 
BT-549, T-47D (breast). 

EXAMPLE 3: STE20-related Protein Kinase Gene Expression Vector Construction 

Materials and Methods 
Expression Vector Construction 

[0589] Several expression constructs were generated for some of the human 
STE20-related cDNAs including: a) full-length clones in a pCDNA expression vector; b) 
a GST-fusion construct containing the catalytic domain of the novel STE20-related kinase 
fused to the C-terminal end of a GST expression cassette; and c) a full-length clone 
containing a Lys to Ala (K to A) mutation at the predicted ATP binding site within the 
kinase domain, inserted in the pCDNA vector. 

[0590] The "K to A" mutants of the STE20-related kinase might function as dominant 
negative constructs, and will be used to elucidate the function of these novel STKs. 

RESULTS 

[0591] Constructs for ZC1, ZC2, ZC3, SULU1, SULU3, PAK4 and PAK5 have been 
generated. 

[0592] Numerous additional constructs have been generated for the various 
STE20-subfamily kinases, including full length, kinase inactive and tagged versions. In 
addition, the following three constructs were designed for specific applications based on 
their unique domain structure: 

Construct 1: SULU1-coiled-coil2 
Vector: pGEX-4T 
Insert: Coiled-coil2 
Sequence: Amino acids 752-898 
Purpose: phage display 
Result: Interacts with GEK2 CC1 

Construct 2: SULU3-coiled-coil2 
Vector: pGEX-4T 
Insert: coiled-coil 2 domain fused to GST 
Sequence range of insert: amino acids 802-898 of SEQ 
Purpose: phage display 
Result: Interacts with coiled-coil region of human SLK 

Construct 3: PAK5 Dominant Negative 
Vector: pCAN5 
Insert: Full length coding sequence of human PAK5 containing the following mutation: 
K350,351A (Lys at aa positions 350 and 351 changed to Ala). 
Purpose: to determine role of human PAK5 kinase activity in cell growth and 
transformation. 
Result: Interferes with Ras transformation. 
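
The Lys-to-Ala substitution used for the kinase-inactive constructs can be illustrated with a minimal sketch. It assumes 1-based residue numbering; the function name `substitute_residues` and the toy sequence are hypothetical and are not the PAK5 sequence, which appears only in the Sequence Listing.

```python
def substitute_residues(protein, positions, new_residue="A", expected="K"):
    """Return a copy of `protein` with the 1-based `positions` replaced.

    Raises ValueError if a targeted position does not hold `expected`,
    which guards against off-by-one numbering mistakes.
    """
    residues = list(protein)
    for pos in positions:
        index = pos - 1  # convert 1-based residue number to 0-based index
        if residues[index] != expected:
            raise ValueError(
                f"position {pos} is {residues[index]!r}, expected {expected!r}"
            )
        residues[index] = new_residue
    return "".join(residues)


# Hypothetical 12-residue toy sequence with adjacent lysines at positions 6 and 7.
toy_sequence = "MSVAIKKLDLRQ"
mutant = substitute_residues(toy_sequence, [6, 7])
print(mutant)
```

For the construct described above, the same call would be made with positions 350 and 351 on the full-length sequence.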

EXAMPLE 4: Generation of Specific Immunoreagents to 
STE20-Related Protein Kinases 

Materials and Methods 

[0593] Specific immunoreagents were raised in rabbits against KLH- or MAP- 
conjugated synthetic peptides corresponding to the human STE20-related kinases. C-terminal 
peptides were conjugated to KLH with glutaraldehyde, leaving a free C-terminus. Internal 
peptides were MAP-conjugated with a blocked N-terminus. Additional immunoreagents can 
also be generated by immunizing rabbits with the bacterially expressed GST-fusion proteins 
containing the cytoplasmic domains of each novel STK. 

[0594] The various immune sera are first tested for reactivity and selectivity to 
recombinant protein, prior to testing for endogenous sources. 

Western blots 

[0595] Proteins in SDS PAGE are transferred to Immobilon membrane. The 
washing buffer is PBST (standard phosphate-buffered saline pH 7.4 + 0.1% Triton X-100). 
Blocking and antibody incubation buffer is PBST + 5% milk. Antibody dilutions varied 
from 1:1000 to 1:2000. 

RESULTS 

[0596] Three SULU1 antisera (against both 539A (SEQ ID NO: 79) and 540A 
(SEQ ID NO: 78)) and two SULU3 antisera (542A) (SEQ ID NO: 81) reacted specifically 
with the peptide antigens. Antisera binding was competable with peptide. Experiments 
with extracts from cells transfected with epitope-tagged SULU1 and SULU3 genes are 
underway. 

[0597] Antisera against the PAK4 C-terminal peptide 554A (SEQ ID NO: 82) 
reacted with purified GST-PAK4 and detected a protein of the correct molecular weight 
from tissue culture cells. Specific immunoprecipitation experiments are ongoing to 
determine the reactivity with native protein. 

[0598] Similar immunization and antisera testing experiments are underway for 
each of the other novel STE20-kinases. 



STE20-related protein kinase peptide immunogens and their specificity in recognizing 
endogenous protein by Western blots or immunoprecipitations. 



Protein   Sequence                                Aa positions   Conj   West.   IP
STLK2     EKFQKCSADESP (SEQ ID NO: 111)           405-416        KLH    Y       Y
STLK4     SISNSELFPTTDPVGT (SEQ ID NO: 112)       252-267        KLH    Y       Y
SULU1     LDFPKEDYR (SEQ ID NO: 113)              890-898        KLH    Y       Y
SULU1     HGDPRPEPRPTQ (SEQ ID NO: 114)           409-420        KLH    Y       Y
SULU3     PSTNRAGSLKDPEC (SEQ ID NO: 115)         2-14           KLH    N       ND
SULU3     DPRTRASDPQSPPQVSRHK (SEQ ID NO: 116)    411-429        KLH    ND      ND
PAK4      CLVPLIQLYRKQTSTC (SEQ ID NO: 117)       666-680        KLH    ND      Y
PAK5      PLMRQNRTR (SEQ ID NO: 118)              390-398        KLH    Y       Y
PAK5      SGDRRRAGPEKRPKSS (SEQ ID NO: 119)       148-163        KLH    Y       Y
PAK5      (C)RRKSLVGTPYWMAPE (SEQ ID NO: 120)     471-485        KLH    Y       ND

ND=not done yet 
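
As a consistency check on the peptide table, the length of each listed sequence can be compared with the span implied by its amino acid positions. The sketch below is illustrative only: the sequences are as transcribed from the table (with the parenthesized Cys of SEQ ID NO: 120 omitted), and the variable names are hypothetical. A one-residue excess may reflect a residue added for conjugation or a transcription error; the text does not say which.

```python
# (protein, sequence as listed, first position, last position)
PEPTIDES = [
    ("STLK2", "EKFQKCSADESP", 405, 416),
    ("STLK4", "SISNSELFPTTDPVGT", 252, 267),
    ("SULU1", "LDFPKEDYR", 890, 898),
    ("SULU1", "HGDPRPEPRPTQ", 409, 420),
    ("SULU3", "PSTNRAGSLKDPEC", 2, 14),
    ("SULU3", "DPRTRASDPQSPPQVSRHK", 411, 429),
    ("PAK4", "CLVPLIQLYRKQTSTC", 666, 680),
    ("PAK5", "PLMRQNRTR", 390, 398),
    ("PAK5", "SGDRRRAGPEKRPKSS", 148, 163),
    ("PAK5", "RRKSLVGTPYWMAPE", 471, 485),
]


def span_length(first, last):
    """Number of residues covered by an inclusive position range."""
    return last - first + 1


# Collect entries whose sequence length differs from the stated span.
mismatches = []
for protein, sequence, first, last in PEPTIDES:
    extra = len(sequence) - span_length(first, last)
    if extra != 0:
        mismatches.append((protein, sequence, extra))

for protein, sequence, extra in mismatches:
    print(f"{protein} {sequence}: {extra:+d} residue(s) relative to stated span")
```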



STE20-related protein kinase GST fusion protein immunogens and their specificity in 
recognizing endogenous protein by Western blots or immunoprecipitations. 

Protein   domain                 Aa positions   West.   IP
ZC1       Coiled-coil/pro/B/C    350-867        Y       Y
ZC1       B                      615-732        Y       Y
ZC2       Coiled-coil/pro/B      348-762        ND      ND
ZC2       B                      658-762        Y       Y
PAK4      Nterm                  252-426        ND      ND
PAK4      Kinase/Cterm           350-681        ND      Y
PAK5      A/Nterm                53-330         ND      ND
PAK5      A/Nterm                53-309         ND      ND

ND=not done yet 



[0599] The 50 kD STLK2 protein was expressed highly in several hematopoietic 
cell lines including Jurkat, pGL10, Ramos, A20, WEHI-231, K562, HEL and freshly 
isolated thymocytes from C57/BL6 mice. High levels of STLK2 expression were also 
detected in several tumor cell lines including Calu6, Colo205, LS180, MDAMB231 and 
A549. 

[0600] The 160 kD ZC1 protein was detected in Jurkat T cells, Colo205, HCT116, 
RIE-1, 293T, MDAMB231, and SK-MEL28. 

[0601] The 170 kD ZC2 protein was detected in SK-MEL28 and UACC-62. 

[0602] Elevated levels of the 64 kD PAK5 protein were confirmed in the breast 
cancer cell lines MDA-231 and MCF-7, and in the lung cancer cell line A549. 

Example 5 : Recombinant Expression and Biological Assays for STE20-related 
Protein Kinases 
Materials and Methods 

Transient Expression of the Ste20-related Kinases in Mammalian Cells 

[0603] The pcDNA expression plasmids (10 µg DNA/100 mm plate) containing 
the STE20-related kinase constructs are introduced into 293 cells with lipofectamine 
(Gibco BRL). After 72 hours, the cells are harvested in 0.5 mL solubilization buffer (20 
mM HEPES, pH 7.35, 150 mM NaCl, 10% glycerol, 1% Triton X-100, 1.5 mM MgCl2, 1 
mM EGTA, 2 mM phenylmethylsulfonyl fluoride, 1 µg/mL aprotinin). Sample aliquots 
were resolved by SDS polyacrylamide gel electrophoresis (PAGE) on 6% acrylamide/0.5% 
bis-acrylamide gels and electrophoretically transferred to nitrocellulose. Non-specific 
binding was blocked by preincubating blots in Blotto (phosphate buffered saline 
containing 5% w/v non-fat dried milk and 0.2% v/v Nonidet P-40 (Sigma)), and 
recombinant protein was detected using the various anti-peptide or anti-GST-fusion 
specific antisera. 

In Vitro Kinase Assays 

[0604] Three days after transfection with the STE20-related kinase expression 
constructs, a 10 cm plate of 293 cells was washed with PBS and solubilized on ice with 2 
mL PBSTDS containing phosphatase inhibitors (10 mM NaHPO4, pH 7.25, 150 mM 
NaCl, 1% Triton X-100, 0.5% deoxycholate, 0.1% SDS, 0.2% sodium azide, 1 mM NaF, 1 
mM EGTA, 4 mM sodium orthovanadate, 1% aprotinin, 5 µg/mL leupeptin). Cell debris 
was removed by centrifugation (12000 x g, 15 min, 4 °C) and the lysate was precleared by 
two successive incubations with 50 µL of a 1:1 slurry of protein A sepharose for 1 hour 
each. One-half mL of the cleared supernatant was reacted with 10 µL of protein A purified 
kinase-specific antisera (generated from the GST fusion protein or antipeptide antisera) 
plus 50 µL of a 1:1 slurry of protein A-sepharose for 2 hr at 4 °C. The beads were then 
washed 2 times in PBSTDS, and 2 times in HNTG (20 mM HEPES, pH 7.5, 150 mM 
NaCl, 0.1% Triton X-100, 10% glycerol). 

[0605] The immunopurified kinases on sepharose beads were resuspended in 20 
µL HNTG plus 30 mM MgCl2, 10 mM MnCl2, and 20 µCi [α-32P]ATP (3000 Ci/mmol). 
The kinase reactions were run for 30 min at room temperature, and stopped by addition of 
HNTG supplemented with 50 mM EDTA. The samples were washed 6 times in HNTG, 
boiled 5 min in SDS sample buffer and analyzed by 6% SDS-PAGE followed by 
autoradiography. Phosphoamino acid analysis was performed by standard 2D methods on 
32P-labeled bands excised from the SDS-PAGE gel. 

[0606] Similar assays were performed on bacterially expressed GST-fusion 
constructs of the kinases. 

[0607] ZC1 Assay buffer: 20 mM Tris pH 7.4, 200 mM NaCl, 0.5 mM DTT, 3 
mM MgCl2, 0.3 mM MnCl2, 100 µM [γ-32P]ATP. 
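
The buffer recipe above can be turned into pipetting volumes with the usual dilution relation (final concentration x total volume / stock concentration). The following is a minimal sketch under stated assumptions: the stock concentrations and the 10 mL batch size are hypothetical and are not given in the text, and the labeled ATP is left out because it is normally added per reaction.

```python
# Assumed stock concentrations in mM (hypothetical; not stated in the text).
STOCKS_MM = {
    "Tris pH 7.4": 1000.0,
    "NaCl": 5000.0,
    "DTT": 1000.0,
    "MgCl2": 1000.0,
    "MnCl2": 1000.0,
}

# Final concentrations in mM, taken from the ZC1 assay buffer recipe.
ZC1_BUFFER_MM = {
    "Tris pH 7.4": 20.0,
    "NaCl": 200.0,
    "DTT": 0.5,
    "MgCl2": 3.0,
    "MnCl2": 0.3,
}


def stock_volumes_ul(final_mm, stocks_mm, total_ul):
    """Volume of each stock (in uL) needed for `total_ul` of buffer, plus water."""
    volumes = {
        name: conc * total_ul / stocks_mm[name] for name, conc in final_mm.items()
    }
    volumes["water"] = total_ul - sum(volumes.values())
    return volumes


volumes = stock_volumes_ul(ZC1_BUFFER_MM, STOCKS_MM, total_ul=10_000.0)
for name, volume in volumes.items():
    print(f"{name:12s} {volume:8.1f} uL")
```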

[0608] Substrates: myelin basic protein (MBP) at 0.28 mg/mL and phosphorylated 
ZC1 peptide RTVGRRNTFIGT-PPYWMAPE (SEQ ID NO: 121) at 17 µM (bold 
underlined residue shows site of phosphorylation). 

[0609] At higher concentrations of MgCl2 (3 mM), the activity of ZC1 (both 
full-length and recombinant kinase domain) is up to 10-fold greater towards exogenous 
substrate MBP. In contrast, the autophosphorylation and the phosphorylation of the 
activation loop peptide substrate are both inhibited. Mn++ does not inhibit the 
autophosphorylation and the peptide phosphorylation by the truncated kinase domain form. 
However, both the MBP-phosphorylating, Mn++-preferring activity and the 
autophosphorylating, Mg++-preferring activity are eliminated with mutation of the 
ATP-binding lysine in ZC1 (Lys54Ala), indicating that both activities are attributable to 
the ZC1 kinase domain. 

[0610] SULU1 Assay buffer: This buffer is identical to that for ZC1, except for 5 
mM MgCl2. Under these conditions, other STE20 family members (PAK4, ZC1) were 
inhibited for autophosphorylation and required reducing the [Mn] to <0.3 mM for an 
efficient autophosphorylation reaction. 

[0611] Substrates: MBP, phosvitin, or α-casein at 0.28 mg/mL. 

[0612] PAK4, PAK5 Assay Buffer: 20 mM HEPES pH 7.2, 130 mM KCl, 10 mM 
MgCl2, 1 mM NaF, 20 mM β-glycerophosphate, 0.5 mM DTT, 50 µM ATP, 0.5 µCi 
[γ-32P]ATP. 

[0613] Substrates: MBP at 0.28 mg/mL and peptide substrates derived from PAK5 
activation loop at 2.5 µM. 

[0614] STLK2 Assay buffer: Similar to that described above, except for the 
inclusion of 5 mM MgCl2, 5 mM MnCl2 and 5 µCi [γ-32P]ATP. 

Transformation (PAK experiments) 

[0615] Low-passage NIH3T3 fibroblasts displaying normal morphology (flat, 
non-refractile cellular morphology), as well as low rates of spontaneous transformation, 
were used in transformation assays. NIH3T3 cells were maintained in Dulbecco's modified 
Eagle's medium supplemented with 10% (v/v) fetal calf serum, penicillin (100 U/mL) and 
streptomycin (100 U/mL) and kept in a humidified incubator at 37 °C and 5% CO2. 

[0616] Cells were transfected with DNA-lipid complexes. Lipofectamine was used 
to transfect NIH3T3 cells according to the manufacturer's instructions. All transfections 
were with equal amounts of plasmid DNA (DNA from the appropriate expression vector 
without insert was used to give equivalent amounts of DNA per transfection). 1 µg of 
activated allele of H-Ras was co-transfected with increasing amounts of various alleles of 
PAK5. 

[0617] Foci were scored after 3 weeks by fixing in 10% methanol, 10% 
acetic acid for 10 min, followed by staining with 0.4% (w/v) crystal violet in 10% 
methanol for 10 min, washing with deionized water and drying at room temperature. 

Transfections, stimulations, and luciferase assays (ZC1 experiments) 

[0618] Cells (10^7) were transiently transfected by electroporation using a Gene 
Pulser (Bio-Rad Labs) with the setting of 960 µF and 250 V. 20-40 hours later, transfected 
cells (about 10^5) were stimulated with various stimuli. After a 6-hour stimulation, cells 
were lysed, and luciferase activities were measured using the MicroLumatPlus (EG&G 
Berthold). (J. Exp. Med. 183:611-620, 1996, hereby incorporated by reference herein in its 
entirety including any drawings, tables, or figures.) 



RESULTS 

Protein expression and kinase activity of novel STE20-related protein kinases 



Protein   Observed size   Predicted    In vitro Kinase   Endogenous Kinase
          (kD)            Size (kD)    activity          activity
STLK2     50              46           y                 y
STLK4     55              50           y                 ND
ZC1       160             140          y                 y
ZC2       170             150          y                 y
KHS2      ND              101          ND                ND
SULU1     119             105          y                 y
SULU3     140             115          ND                y
PAK4      80              75           y                 y
PAK5      64              64           y                 y

ZC1: Regulation of kinase activity 

[0619] ZC1 is constitutively active as a full-length kinase when expressed either in 
vitro (TNT rabbit reticulocyte system) or in NIH 3T3, 293T, or H1299 tissue culture cells. 
The endogenously expressed kinase is also active when immunoprecipitated from 
carcinoma cell lines. 

ZC1 signaling Pathways 

[0620] Using human leukemic T cell line Jurkat as a model system, the impact of 
cotransfected wild-type ZC1 on the activation of two reporter genes, RE/AP-luciferase and 
NFκB-luciferase, was examined. RE/AP is a composite element in the IL-2 gene promoter 
containing both a NFκB-like site and an AP-1 site. 

[0621] Optimal activation of both RE/AP-luciferase and NFκB-luciferase reporter 
genes in Jurkat T cells requires signals generated from stimulation of both T cell receptor 
and the costimulator receptor CD28. Cotransfection of wild-type ZC1 with either the 
RE/AP-luciferase or the NFκB-luciferase reporter results in the activation of RE/AP or 
NFκB when costimulated with the anti-T cell receptor monoclonal antibody or the 
pharmacological reagents PMA and ionomycin that bypass proximal T cell receptor. No 
activation was seen when costimulated with an anti-CD28 monoclonal antibody. 

[0622] These results suggest that wild-type ZC1, when overexpressed, was 
replacing a CD28-specific signal to activate RE/AP or NFκB. These results imply that 
ZC1 is involved in the CD28 signaling pathway. Since NFκB is one of the major 
pathways also activated by the pro-inflammatory cytokine TNF-α signaling, it is also 
likely that ZC1 may be a component in the TNF-α signaling pathways. 

PAK5: Design of specific peptide substrates 

[0623] To aid in the development of in vitro kinase assays for screening small 
molecule libraries to identify specific inhibitors, the search for specific peptide substrates 
for PAK5 was undertaken. 

[0624] The rationale used to design such peptides is based on the hypothesis that 
upon binding activated small G protein, PAK5 undergoes a conformational change that 
results in derepression of its kinase activity followed by autophosphorylation on the 
activation loop resulting in a fully active kinase. The site of autophosphorylation for 
related family members has been identified by biochemical and/or genetic means (e.g. Wu, 
C., et al. J. Biol. Chem. 270:15984-15992 and Szczepanowska, et al. Proc. Natl. Acad. Sci. 
94, 8503-8508, 1997). Specific peptide substrates for PAK5 were designed from the 
sequence of the activation loop of this kinase. 

[0625] An activation loop PAK5 peptide phosphorylated on the Thr residue of the 
TPY motif served as a high-affinity substrate for PAK5. 



PAK5 activation loop peptides as kinase substrates 



Peptide #   Kinase   Sequence               Aa        SEQ ID   Kinase   substrate
1           PAK5     (C)RRKSLVGTPYWMAPE     471-485   120      PAK5     yes
2           PAK5     (C)RRKSLVGTPYWMAPE     471-485   120      PAK5     yes
3           PAK5     (C)RRKSLVGTPYWMAPE     471-485   120      PAK5     no
4           KHS1     KRKSFIGTPYWMAPE        171-185   122      PAK5     yes
5           STLK2    KRNTFVGTPFWMAPE        175-189   123      PAK5     poor
6           SULU1    PANSFVGTPYWMAPE        174-188   124      PAK5     poor
7           ZC1      RRNTFIGTPYWMAPE        184-198   125      PAK5     poor
8           ZC1      RRNTFIGTPYWMAPE        184-198   126      PAK5     poor
9           STLK4    RNKVRKTFVGTPCWMAPE     66-83     127      PAK5     poor
10          PAK5     (C)RRKSLVGTPYWMAPE     471-485   120      PAK4     yes

Note: underlined/bold residue was phosphorylated 



Peptide #   Kinase   Notes
1           PAK5     Equally well as MBP
2           PAK5     High Km for PAK5 (1-10 µM)
3           PAK5     S is the site of phosphorylation
4           KHS1     Similar to peptide 1
5           STLK2
6           SULU1
7           ZC1
8           ZC1      Better than 7
9           STLK4
10          PAK5     Same Km as phosph. by PAK5



PAK5: Transformation 

[0626] Transformation of low-passage NIH3T3 cells by ras in the presence or 
absence of various alleles of PAK5 showed that the dominant negative, kinase-dead allele 
of PAK5 was able to block ras transformation of NIH3T3 cells. Thus, PAK5 activity is 
required for ras transformation of NIH3T3 cells. Inhibition of PAK5 activity may have 
therapeutic value as an anti-proliferative agent for treating cancer. 



PAK4 and PAK5: interaction with Cdc42 

[0627] PAK4 interacts with the Cdc42 small G-protein but not Rac, RhoA, or Ras, as 
determined by co-transfection of recombinant genes and detection by kinase assays. PAK5 
also interacts with Cdc42. Coding sequences of activated alleles of small G proteins (Ras, 
Cdc42, Rac, Rho) tagged with a Myc epitope were transiently expressed in 293T cells; 
various alleles of 35S-labeled PAK5 tagged with HA epitope were expressed in vitro with 
the reticulocyte (TNT) system. 



Example 6 : Chromosomal Localization of Ste20-Related Protein Kinases 
Materials And Methods 

[0628] STE20 protein kinases STLK3, STLK4, ZC1, ZC2, ZC3, KHS2, SULU1, 
PAK4, and PAK5 were mapped using the GeneBridge 4 Radiation Hybrid Panel, RH02.05 
(Research Genetics). The GeneBridge 4 Panel consists of 91 hybrid panel samples, in 
addition to one human positive control (HFL), and one hamster negative control (A23). 
The standard reaction conditions used to test and conduct PCR reactions using the 
GeneBridge 4 Panel are available from Research Genetics. 



[0629] Oligonucleotide sequences (all 5' to 3') used for PCR mapping were: 

STLK3: CTCCCATTTCCTAGCAAAATCA (SEQ ID NO: 128), 
AGAGGCAGTATTGTCAGATGTA (SEQ ID NO: 129) 

STLK4: CCACACATGCGTATCTCTGTTG (SEQ ID NO: 130), 
TTGCTAGAATTCACATCAGGTACA (SEQ ID NO: 131) 

ZC1: ATCCCTGGATCACACTGCTTCT (SEQ ID NO: 132), 
CAAGGTGTTCTTTGCCTCTGTT (SEQ ID NO: 133) 

ZC2: AGATGGACTGTACTGGGAGGG (SEQ ID NO: 134), 
AGAAGAGCACTTGGCACTTATC (SEQ ID NO: 135) 

ZC3: CATCATGAACTGGTGACGGG (SEQ ID NO: 136), 
CCAGTGAAATCAAACCAGTAAAA (SEQ ID NO: 137) 

SULU1: CAAAACCTGGCCGTCTCTTCTATT (SEQ ID NO: 138), 
ATTTGTGCTACTGGGATTCTGTG (SEQ ID NO: 139) 

KHS2: GAATAGCGGTACCATGATAGAATA (SEQ ID NO: 140), 
TACCAAAAAGAGCCAAAAGTGTG (SEQ ID NO: 141) 

PAK4: CTCAGTATTCTCTCCAAAGATTG (SEQ ID NO: 142), 
GATGTTCTCTCCATTCTGTAAAG (SEQ ID NO: 143) 

PAK5: CATCACTGGAAGTCTGCAGTG (SEQ ID NO: 144), 
CAGGTGCAGTAGTCATTTGC (SEQ ID NO: 145) 
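
Basic properties of the mapping primers, such as length, GC content and an approximate melting temperature, can be estimated with a short sketch. The Wallace rule used here (2 °C per A/T plus 4 °C per G/C) is only a rough approximation for short oligonucleotides and is an assumption of this example, not a method stated in the text; the function names are hypothetical. The two sequences are the PAK5 primers listed above.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 100.0 * gc / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


pak5_forward = "CATCACTGGAAGTCTGCAGTG"  # SEQ ID NO: 144
pak5_reverse = "CAGGTGCAGTAGTCATTTGC"   # SEQ ID NO: 145

for label, primer in (("forward", pak5_forward), ("reverse", pak5_reverse)):
    print(
        f"PAK5 {label}: {len(primer)} nt, "
        f"GC {gc_percent(primer):.1f}%, Tm ~{wallace_tm(primer)} C"
    )
```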



[0630] Positive reactions were assigned a score of "1", negative reactions were 
assigned a score of "0", and ambiguous reactions were assigned a score of "2". Results were 
submitted to the Whitehead Institute (www@genome.wi.mit.edu) for position analysis. 
Chromosomal localizations for ZC4, SULU3, STLK2, STLK5 and STLK6 were available 
publicly (for example, from Unigene). The chromosomal locations of GEK2 and STLK7 
have not been determined. 
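
The scoring convention described above (1, 0, 2) lends itself to a compact data vector, one character per hybrid. The sketch below is illustrative only: the call labels, the function name `encode_rh_vector` and the five example calls are hypothetical, and the ordering of hybrids and the exact submission format are not specified in the text.

```python
# Score codes as described in the text: positive = 1, negative = 0, ambiguous = 2.
SCORE_CODES = {"positive": "1", "negative": "0", "ambiguous": "2"}


def encode_rh_vector(calls):
    """Join per-hybrid PCR calls into a single score string."""
    return "".join(SCORE_CODES[call] for call in calls)


# Hypothetical calls for the first five hybrids of a panel.
example_calls = ["positive", "negative", "negative", "ambiguous", "positive"]
vector = encode_rh_vector(example_calls)
print(vector)
```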



STLK2_h    Xq25-27.1          (Public)
STLK3      2q31.3             (Sugen)
STLK4_h    3p22.3-p22.2       (Sugen)
STLK5_h    17q23.2-24.2       (Public)
STLK6_h    2q32.2-q33.3       (Public)
STLK7_h    NA
ZC1_h      2p11.2             (Sugen)
ZC2_h      3q26.31-3q26.32    (Sugen)
ZC3_h      17p13.2-13.3       (Sugen)
ZC4_h      Xq22               (Public)
KHS2_h     2p22-2p22.2        (Sugen)
SULU1_h    12q24.21           (Sugen)
SULU3_h    17p11.2            (Public)
GEK2_h     NA
PAK4_h     15q14              (Sugen)
PAK5_h     19q13.2-q13.3      (Sugen)



[0631] Many of the STE20 kinases were mapped to regions associated with 
various human cancers, as shown below. 

[0632] The regions were also cross-checked with the Mendelian Inheritance in 
Man database, which tracks genetic information for many human diseases, including 
cancer. References for association of the mapped sites with chromosomal abnormalities 
found in human cancer can be found in: Knuutila, et al., Am J Pathol, 1998, 
152:1107-1123, hereby incorporated herein by reference in its entirety including any 
figures, tables, or drawings. Association of these mapped regions with other diseases is 
documented in the Online Mendelian Inheritance in Man (OMIM). 

STLK2_h, Xq25-27.1, (Public) 
Osteosarcoma, Xq25-qter, 2 of 31. 
Lymphoproliferative syndrome, X-linked (OMIM No. 308240) 

human STLK3, 2q31.3, (Sugen) 
Squamous cell carcinoma of Head and Neck, 3 of 30. 

STLK4_h, 3p22.3-p22.2, (Sugen) 
Mantle cell lymphoma, 3p14-p22, 1 of 27 
Squamous cell carcinoma of Head and Neck, 3p22-p24, 1 of 14 
Cardiomyopathy, dilated (OMIM 601154) 

STLK5_h, 17q23.2-24.2, (Public) 
Cervical cancer, 17q, 1 of 30 
Gastroesophageal junction adenocarcinoma xenograft, 17q, 1 of 5 
Breast carcinoma, 17q12-qter, 1 of 16 
Bladder carcinoma, 17q22-q23, 1 of 14 
Breast carcinoma, 17q22-q25, 8 of 101 
Non-small cell lung cancer, 17q24-q25, 6 of 50 
Testis, 17q24-qter, 2 of 11 
Malignant peripheral nerve sheath tumors, 17q24-qter, 5 of 7 
Alzheimer disease, susceptibility to (OMIM 106180) 

STLK6_h, 2q32.2-q33.3, (Public) 
Non-small cell lung cancer, 2q31-q32, 1 of 50 
Squamous cell carcinoma of Head and Neck, 2q31-q33, 3 of 30 
Small cell lung cancer, 2q32-q35, 1 of 22 

ZC1_h, 2p11.2, (Sugen) 
Non-small cell lung cancer, 2pter-q13, 1 of 10 
Non-small cell lung cancer, 2pter-q21, 1 of 10 
Pulmonary alveolar proteinosis, congenital (OMIM 178640). 

ZC2_h, 3q26.31-3q26.32, (Sugen) 
Non-small cell lung cancer, 3q26.1-q26.3, 26 of 103 
Cervical cancer, 3q26.1-q27, 4 of 30 
Small cell lung cancer, 3q26.3-qter, 3 of 35 
Squamous cell carcinoma of Head and Neck, 3q26.3-qter, 3 of 13 
Marginal zone B-cell lymphoma, 3q26-q27, 1 of 25 
Parosteal osteosarcoma, 3q26-q28, 1 of 1 
Gastrointestinal stromal tumor, 3q26-q29, 1 of 16 
Mantle cell lymphoma, 3q26-q29, 1 of 5 

ZC3_h, 17p13.2-13.3, (Sugen) 
Malignant fibrous histiocytoma of soft tissue, 17p, 2 of 58 
Leiomyosarcoma, 17p, 7 of 29 
Non-small cell lung cancer, 17p, 1 of 50 

ZC4_h, Xq22, (Public) 
Diffuse large cell lymphoma, Xq22-ter, 1 of 32 
Deafness, X-linked 1, progressive. (OMIM 304700). 

KHS2_h, 2p22-2p22.2, (Sugen) 
Synovial sarcoma, 2p21-q14, 1 of 67 
Follicular lymphoma, 2p22-p24, 1 of 46 
Colorectal cancer, hereditary, nonpolyposis, type 1, Ovarian cancer (MSH2, COCA1, 
FCC1). (OMIM 120435). 

SULU1_h, 12q24.21, (Sugen) 
Neuroglial tumors, 12q22-qter, 1 of 15 
Gastroesophageal junction adenocarcinoma, 12q23-qter, 1 of 5. 
Non-small cell lung cancer, 12q24.1-24.3, 2 of 50. 

SULU3_h, 17p11.2, (Public) 
Malignant fibrous histiocytoma of soft tissue, 17p, 2 of 58 
Leiomyosarcoma, 17p, 7 of 29 
Non-small cell lung cancer, 17p, 1 of 50 
Diffuse large cell lymphoma, 17p11.2, 1 of 32 
Osteosarcoma, 17p11.2-p12, 4 of 31 

PAK4_h, 15q14, (Sugen) 
Schizophrenia, (OMIM 118511). 

PAK5_h, 19q13.2-q13.3, (Sugen) 
Follicular lymphoma, 19q13, 1 of 46* 
Mantle cell lymphoma, 19q13, 1 of 5 
Hepatocellular carcinoma, 19q13.1, 2 of 50 
Small cell lung cancer, 19q13.1, 10 of 35 
Breast carcinoma, 19q13.1-qter, 1 of 33 
Cervical cancer, 19q13.1-qter, 1 of 30 
Testis, 19q13.1-qter, 1 of 11 
Chondrosarcoma, 19q13.2, 1 of 29 
Malignant fibrous histiocytoma of soft tissue, 19q13.2-qter, 2 of 58 
Non-small cell lung cancer, 19qcen-q13.3, 6 of 104 

Example 7 : Demonstration Of Gene Amplification By Southern Blotting 
Materials and Methods 

[0633] Nylon membranes were purchased from Boehringer Mannheim. 
Denaturing solution contains 0.4 M NaOH and 0.6 M NaCl. Neutralization solution 
contains 0.5 M Tris-HCl, pH 7.5 and 1.5 M NaCl. Hybridization solution contains 50% 
formamide, 6X SSPE, 2.5X Denhardt's solution, 0.2 mg/mL denatured salmon DNA, 0.1 
mg/mL yeast tRNA, and 0.2% sodium dodecyl sulfate. Restriction enzymes were 
purchased from Boehringer Mannheim. Radiolabeled probes were prepared using the 
Prime-it II kit by Stratagene. The beta actin DNA fragment used for a probe template was 
purchased from Clontech. 

[0634] Genomic DNA was isolated from 20 different tumor cell lines: MCF-7, 
MDA-MB-231, Calu-6, A549, HCT-15, HT-29, Colo 205, LS-180, DLD-1, HCT-116, 
PC3, CAPAN-2, MIA-PaCa-2, PANC-1, AsPc-1, BxPC-3, OVCAR-3, SKOV3, SW 626 
and PA-1, and from two normal cell lines: human mammary epithelial cells and human 
umbilical vein endothelial cells. 

[0635] A 10 µg aliquot of each genomic DNA sample was digested with EcoR I 
restriction enzyme and a separate 10 µg sample was digested with Hind III restriction 
enzyme. The restriction-digested DNA samples were loaded onto a 0.7% agarose gel and, 
following electrophoretic separation, the DNA was capillary-transferred to a nylon 
membrane by standard methods (Sambrook, J. et al. (1989) Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor Laboratory). 

PAK5 Amplicon : 

[0636] A 600 base pair fragment (EcoR I - Sac I) of the PAK5 gene was used as a 
template for a radiolabeled DNA probe which was hybridized to the blots at 42 °C for 48 
hours in hybridization solution using standard methods (supra). The blots were exposed to 
a phosphorimager screen for 4 days, then scanned and analyzed using a Molecular 
Dynamics Storm 840 phosphorimager. The relative mass and gene copy number values of 
the PAK5 DNA fragments were calculated from the band density values obtained. The 
blots were re-hybridized with a radiolabeled probe copied from a fragment of human beta 
actin DNA and developed as above to confirm the sample mass loading equivalency. 
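
One way such a calculation could be carried out is to normalize each PAK5 band to the beta actin band of the same lane and then compare tumor with normal DNA. The sketch below is a hedged illustration of that idea, not the exact procedure used: the function names, the assumption of two copies in normal DNA, and the band density values are all hypothetical.

```python
def fold_amplification(target_sample, actin_sample, target_normal, actin_normal):
    """Actin-normalized target signal in a sample relative to normal DNA."""
    if min(actin_sample, actin_normal, target_normal) <= 0:
        raise ValueError("band densities used as denominators must be positive")
    sample_ratio = target_sample / actin_sample
    normal_ratio = target_normal / actin_normal
    return sample_ratio / normal_ratio


def estimated_copy_number(fold, normal_copies=2.0):
    """Convert fold amplification to copies, assuming a diploid normal reference."""
    return fold * normal_copies


# Hypothetical band densities (arbitrary phosphorimager units).
fold = fold_amplification(
    target_sample=9000.0, actin_sample=3000.0,
    target_normal=2000.0, actin_normal=2000.0,
)
print(f"fold amplification: {fold:.1f}")
print(f"estimated copies:   {estimated_copy_number(fold):.1f}")
```

Normalizing to the actin signal corrects for unequal DNA loading between lanes, which is the purpose of the re-hybridization step described above.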

RESULTS 

[0637] The PAK5 gene was determined to exhibit 3-fold amplification compared 
to the normal DNA copy number in PANC-1 (pancreatic epithelioid carcinoma) and 
OVCAR-3 (ovarian adenocarcinoma) human cell lines, and approximately 2 times the 
normal copy number in the BxPC-3 (primary pancreatic adenocarcinoma) human cell line. 

[0638] Similar Southern analyses can be performed for other STE20 kinases. 

Example 8: Detection Of Protein-Protein Interaction 
Through Phage Display 

Materials And Methods 

[0639] Phage display provides a method for isolating molecular interactions based 
on affinity for a desired bait. cDNA fragments cloned as fusions to phage coat proteins are 
displayed on the surface of the phage. Phage(s) interacting with a bait are enriched by 
affinity purification and the insert DNA from individual clones is analyzed. 

T7 Phage Display Libraries 

[0640] All libraries were constructed in the T7Select1-1b vector (Novagen) 
according to the manufacturer's directions. 

Bait Presentation 

[0641] Protein domains to be used as baits were generated as C-terminal fusions to 
GST and expressed in E. coli. Peptides were chemically synthesized and biotinylated at 
the N-terminus using a long chain spacer biotin reagent. 

Selection 

[0642] Aliquots of refreshed libraries (10^10 to 10^12 pfu) supplemented with PanMix 
and a cocktail of E. coli inhibitors (Sigma P-8465) were incubated for 1-2 hrs at room 
temperature with the immobilized baits. Unbound phage was removed by extensive washing 
(at least 4 times) with wash buffer. 

[0643] After 3-4 rounds of selection, bound phage was eluted in 100 µL of 1% 
SDS and plated on agarose plates to obtain single plaques. 

Identification of insert DNAs 

[0644] Individual plaques were picked into 25 µL of 10 mM EDTA and the phage 
was disrupted by heating at 70 °C for 10 min. 2 µL of the disrupted phage were added to 
50 µL PCR reaction mix. The insert DNA was amplified by 35 rounds of thermal cycling 
(94 °C, 50 sec; 50 °C, 1 min; 72 °C, 1 min). 
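
As a quick check on the numbers in the paragraph above, the following Python sketch encodes the cycling program as data, sums the programmed hold times, and estimates how much EDTA is carried into each reaction with the disrupted phage. The function and variable names are illustrative assumptions, ramp times between steps are ignored, and the final reaction volume is assumed to be the simple sum of the two volumes.

```python
# Cycling program as stated in the text: (step name, temperature in deg C, hold in seconds).
CYCLES = 35
STEPS = [("denature", 94, 50), ("anneal", 50, 60), ("extend", 72, 60)]


def total_hold_seconds(cycles: int = CYCLES, steps=STEPS) -> int:
    """Sum of programmed hold times; instrument ramp times are not included."""
    return cycles * sum(seconds for _name, _temp, seconds in steps)


def carryover_mM(stock_mM: float, added_uL: float, mix_uL: float) -> float:
    """Concentration of a carried-over solute after dilution into the reaction."""
    return stock_mM * added_uL / (added_uL + mix_uL)


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
    # 2 uL of phage in 10 mM EDTA added to 50 uL of PCR reaction mix.
    print(f"EDTA carried into the reaction: {carryover_mM(10.0, 2.0, 50.0):.2f} mM")
```

Under these assumptions the carried-over EDTA comes to a little under 0.4 mM, which is relevant because EDTA chelates the magnesium that Taq polymerase requires.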

Composition of Buffer 

10x PanMix 
5% Triton X-100 
10% non-fat dry milk (Carnation) 
10 mM EGTA 
250 mM NaF 
250 µg/mL heparin (Sigma) 
250 µg/mL sheared, boiled salmon sperm DNA (Sigma) 
0.05% Na azide 
Prepared in PBS 

Wash Buffer 

PBS supplemented with: 
0.5% NP-40 
25 µg/mL heparin 

PCR reaction mix 

1.0 mL 10x PCR buffer (Perkin-Elmer, with 15 mM Mg) 
0.2 mL each dNTPs (10 mM stock) 
0.1 mL T7UP primer (15 pmol/µL) GGAGCTGTCGTATTCCAGTC (SEQ ID NO:146) 
0.1 mL T7DN primer (15 pmol/µL) AACCCCTCAAGACCCGTTTAG (SEQ ID NO:147) 
0.2 mL 25 mM MgCl2 or MgSO4 to compensate for EDTA 
Q.S. to 10 mL with distilled water 
Add 1 unit of Taq polymerase per 50 µL reaction 
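
The recipe above gives stock volumes for a 10 mL batch. The following Python sketch works out the resulting final concentrations by simple dilution arithmetic. It assumes that "with 15 mM Mg" describes the 10x buffer stock itself and that 0.2 mL of each of the four dNTP stocks is added; the variable names are illustrative only.

```python
BATCH_ML = 10.0  # final volume of the mix, from "Q.S. to 10 mL"


def final_conc(stock_conc: float, volume_ml: float, batch_ml: float = BATCH_ML) -> float:
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock_conc * volume_ml / batch_ml


# Assumption: the 10x buffer stock contains 15 mM Mg.
buffer_strength_x = final_conc(10.0, 1.0)      # buffer strength, in "x"
mg_from_buffer_mM = final_conc(15.0, 1.0)      # Mg contributed by the buffer
mg_supplement_mM = final_conc(25.0, 0.2)       # extra MgCl2 or MgSO4
total_mg_mM = mg_from_buffer_mM + mg_supplement_mM
dntp_each_mM = final_conc(10.0, 0.2)           # each dNTP
primer_pmol_per_uL = final_conc(15.0, 0.1)     # each primer (numerically equal to uM)
primer_pmol_per_50uL = primer_pmol_per_uL * 50.0

if __name__ == "__main__":
    print(f"buffer: {buffer_strength_x:.1f}x")
    print(f"Mg: {mg_from_buffer_mM:.2f} mM from buffer + "
          f"{mg_supplement_mM:.2f} mM supplement = {total_mg_mM:.2f} mM")
    print(f"each dNTP: {dntp_each_mM:.2f} mM")
    print(f"each primer: {primer_pmol_per_uL:.2f} pmol/uL "
          f"({primer_pmol_per_50uL:.1f} pmol per 50 uL reaction)")
```

Under these assumptions the supplement adds 0.5 mM magnesium, which is consistent with its stated purpose of offsetting the EDTA introduced with the disrupted phage.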

LIBRARY : T7 Select 1-H441 
RESULTS 



Phage display baits and interactors 

Bait    Domain         Aa        Patent SEQ ID   cDNA library   Interactor       Sequence Range & SEQ ID 
SULU1   Coiled-coil2   752-898   22              H441           GEK2 cc dom(1)   677-820, SEQ #26 
SULU3   Coiled-coil2   755-898   23              H441           SLK isoform      M83780 

SULU1 ccl also interacted to a lesser extent with the coiled-coil domain of an SLK isoform. 



[0645] The phage display data suggest potential interactions of SULU3 with SLK 
and SULU1 with GEK2 through their coiled-coil domains. Thus, two members of the 
SULU subfamily of STE20 kinases appear to interact with two members of a separate STE20 
family, of which SLK is the prototype. 

[0646] These results suggest a specificity in the interaction, and imply that these 
STE20 kinases may interact with each other through homo- and hetero-dimerization. 
Alternatively, SULU-related kinases could act immediately up- or down-stream of the 
SLK-related kinases in a signaling cascade. 

[0647] One skilled in the art would readily appreciate that the present invention is 
well adapted to carry out the objects and obtain the ends and advantages mentioned, as 
well as those inherent therein. The molecular complexes and the methods, procedures, 
treatments, molecules, and specific compounds described herein are presently representative of 
preferred embodiments, are exemplary, and are not intended as limitations on the scope of 
the invention. Changes therein and other uses will occur to those skilled in the art which 
are encompassed within the spirit of the invention and are defined by the scope of the claims. 

[0648] It will be readily apparent to one skilled in the art that varying substitutions 
and modifications may be made to the invention disclosed herein without departing from 
the scope and spirit of the invention. 

[0649] All patents and publications mentioned in the specification are indicative of 
the levels of those skilled in the art to which the invention pertains. 

[0650] The invention illustratively described herein suitably may be practiced in 
the absence of any element or elements, limitation or limitations which is not specifically 
disclosed herein. Thus, for example, in each instance herein any of the terms 
"comprising", "consisting essentially of" and "consisting of" may be replaced with either 
of the other two terms. The terms and expressions which have been employed are used as 
terms of description and not of limitation, and there is no intention in the use of such 
terms and expressions of excluding any equivalents of the features shown and described or 
portions thereof, but it is recognized that various modifications are possible within the 
scope of the invention claimed. 

[0651] In particular, although some formulations described herein have been 
identified by the excipients added to the formulations, the invention is meant to also cover 
the final formulation formed by the combination of these excipients. Specifically, the 
invention includes formulations in which one to all of the added excipients undergo a 
reaction during formulation and are no longer present in the final formulation, or are 
present in modified forms. 

[0652] In addition, where features or aspects of the invention are described in 
terms of Markush groups, those skilled in the art will recognize that the invention is also 
thereby described in terms of any individual member or subgroup of members of the 
Markush group. For example, if X is described as selected from the group consisting of 
bromine, chlorine, and iodine, claims for X being bromine and claims for X being bromine 
and chlorine are fully described. 

[0653] Other embodiments are within the following claims. 






